Semi-quantitative FFQ are valid and reliable dietary assessment methods used worldwide in adolescents and are suggested as appropriate tools for the collection of dietary intake data in large-scale surveys( Reference Ortiz-Andrellucchi, Henríquez-Sánchez and Sánchez-Villegas 1 , Reference Cade, Burley and Warm 2 ), since they are easy to administer, save economic resources and can assess dietary intake over an extended period of time( Reference Subar 3 ). Among the FFQ in use, large variations in design characteristics, such as number of food items or consumption interval, have been highlighted( Reference Molag, de Vries and Ocké 4 ).
Our recent systematic literature review( Reference Tabacchi, Amodio and Di Pasquale 5 ) identified the FFQ used in adolescents and validated during the last decade throughout the world. One of the aspects emphasized by the review is that there is an ongoing need for the refinement of existing approaches, especially ones that can be used in large epidemiological studies.
When preparing the tools for dietary data collection, the specific design and validation issues of the data collecting instrument have to be taken into account. There are many factors that may affect the accuracy of a dietary questionnaire such as respondent characteristics, questionnaire design and quantification, adequacy of the reference data, quality control and data management( Reference Serra-Majem, Frost Andersen and Henríque Sánchez 6 ), including the statistical analyses of validation data. This leads to the necessity to further characterize or create new FFQ targeted to adolescents to address the need for a valid, reproducible, user-friendly, fast, cost-effective, standardized method of accurately assessing nutrient intakes in adolescents.
The ASSO Project (Adolescents and Surveillance System for the Obesity prevention), funded by the Italian Ministry of Health and involving different national and international partners, aims at developing an innovative web-based system for a standardized collection of data on food consumption and lifestyles in adolescents( Reference Tabacchi 7 ). To this purpose, valid and reliable instruments are envisaged to be developed within the Project( Reference Tabacchi and Bianco 8 ), including a questionnaire for the assessment of food consumption and nutrient intakes. Our previously mentioned review suggested the development of a new semi-quantitative FFQ that could fit the purposes of the ASSO Project. In order to establish the design of an appropriate FFQ that could provide valid data, the present work was aimed at conducting a meta-analysis of the validity studies of FFQ specifically addressed to adolescents. The overall degree of correlation and agreement between FFQ and the reference method was assessed and variables that can affect FFQ validity identified.
Methods
Systematic literature review
A systematic literature review was recently performed by the authors on studies describing dietary assessment methods in adolescents published worldwide between 2001 and 2012( Reference Tabacchi, Amodio and Di Pasquale 5 ). The electronic databases MEDLINE, EMBASE, ISI Web of Science and Cochrane were explored. In the MEDLINE and Cochrane databases, besides free text terms, Medical Subject Headings (MeSH) and MeSH Major Topics were included in the syntax. A sensitivity check was executed by systematically deleting terms from the syntax to see whether important articles were missed with the current syntax. Publication language was restricted to English, Italian, Spanish and French. Key search terms, used alone and in combination, included the following: terms referring to the type of dietary method (questionnaire, 24-HR, 24 h recall, 24-h recall, FFQ, history, record, diary); terms including diet, nutrition, food, intake; and terms related to the validation of the methods (validity, validation, accuracy, accurate). Additional searches were carried out on websites of national and international organizations (e.g. university websites and relevant professional societies or organizations) and the grey literature was also considered. Studies that used biomarkers were not considered, since biomarkers often reflect status rather than intake and short-term rather than long-term intakes, and are invasive and expensive( Reference Lampe and Rock 9 ). The reference lists of articles retrieved for inclusion in the review were hand-searched to identify other relevant articles.
Studies that met all of the following inclusion criteria were included in the review: describing dietary assessment methods developed for epidemiological purposes; targeting adolescent populations in the age range 13–17 years; and reporting the validity and/or reproducibility of the method v. one reference method.
The retrieved records were sent to Endnote® (version X 4·02). After removing all duplicates, titles and abstracts were screened. When a title or abstract could not be rejected with certainty, the paper was included among the potentially eligible papers and the full text was further evaluated. The following exclusion criteria were applied: population age not in the range 13–17 years; non-healthy subjects; hospitalized or not free-living subjects; pregnant adolescent women; refugees; vulnerable populations such as low-income or rural; specific ethnicity; overweight/obese subjects; athletes; vegetarians; dietary instrument specific only to certain nutrients (folate, vitamins, calcium, fats, proteins, etc.), specific only to certain foods (alcohol, beverages, fruit and vegetables, sugar snacks, seafood, etc.) or specific only to energy and fast-food consumption; feeding study or intervention study; subjects with eating disorders; study relative to eating or health behaviour; psychometric tests (e.g. for craving); subjects with food allergies; study relative to the intake of particular substances (acrylamide, etc.); questionnaire only for physical activity assessment; questionnaire only for nutrition knowledge assessment; study aimed at perceptions; study where only parental reporting on their children was considered; study with only food insecurity measurement; and study with only portion size estimation.
The full texts of the articles assessed for eligibility were then examined. Some articles and the corresponding full versions of the questionnaires were obtained through direct contact with the authors.
The literature search and the systematic review were conducted by two independent investigators, after a standardization of the procedure. In the case of any incongruity, the two investigators came to an agreement after further analysis and discussion. Further details on the systematic literature review can be found in the published paper( Reference Tabacchi, Amodio and Di Pasquale 5 ).
Data extraction
Data indicating correlation and agreement between the FFQ and the reference method were considered, from each retrieved study, in relation to energy and the following nutrient intakes: protein, carbohydrate, sugar, fibre, starch, total fat, SFA, MUFA, PUFA, cholesterol, thiamin, riboflavin, niacin, vitamin B6, vitamin B12, folic acid, vitamin C, vitamin A, carotene, vitamin D, vitamin E, Ca, Mg, P, K, Na, Fe, Zn, Cu and iodine.
In detail, Pearson’s or Spearman’s correlation coefficients, means and standard deviations, kappa agreement, percentiles and mean agreement/limits of agreement (LOA) estimated through the Bland–Altman method were extracted and analysed. Prior to the extraction, a data extraction form was developed, which was completed by two independent reviewers after an informal training exercise.
Meta-analysis of correlation coefficients and of means/standard deviations
To determine the overall degree of correlation between FFQ and reference method, the correlation coefficients were extracted. In addition, means and standard deviations were also extracted and meta-analysed in order to assess the overall agreement derived from pooling together the populations from different studies. All data were analysed by using the statistical software package STATA/MP 12·1, with the ‘metan’ command used for meta-analysis( Reference Bradburn, Deeks and Altman 10 ).
Pooled estimates were calculated using both fixed-effects and DerSimonian and Laird( Reference DerSimonian and Laird 11 ) random-effects models (the latter estimate the mean of a distribution of effects), weighting individual study results by the inverse of their variances. Forest plots were used to visually assess the pooled estimates and corresponding 95 % confidence intervals across studies. A test of heterogeneity was performed using a χ 2 test( Reference Fleiss 12 ) at a significance level of P<0·05 and reported with the I 2 statistic, in which cut-offs of 25 %, 50 % and 75 % indicate low, moderate and high heterogeneity, respectively( Reference Higgins, Thompson and Deeks 13 ).
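As an illustration of the pooling just described, the following minimal sketch computes an inverse-variance fixed-effect estimate, the DerSimonian and Laird between-study variance, the random-effects estimate and the I 2 statistic. It is not the 'metan' implementation used in the analysis; the function name `pool` and the input values in the usage note are hypothetical.

```python
import math


def pool(effects, variances):
    """Inverse-variance pooling with a DerSimonian-Laird random-effects step.

    effects   -- per-study effect estimates (e.g. SMD or correlation)
    variances -- per-study sampling variances of those estimates
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Fixed-effect estimate: weighted mean with weights 1/variance.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Cochran's Q (the chi-square heterogeneity statistic) and its df.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1

    # DerSimonian-Laird moment estimate of the between-study variance.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects estimate: weights 1/(variance + tau2).
    re_weights = [1.0 / (v + tau2) for v in variances]
    random_est = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    random_se = math.sqrt(1.0 / sum(re_weights))

    # I2: percentage of total variation attributable to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "fixed": fixed,
        "random": random_est,
        "random_ci95": (random_est - 1.96 * random_se,
                        random_est + 1.96 * random_se),
        "q": q,
        "tau2": tau2,
        "i2": i2,
    }
```

For example, `pool([0.2, 0.5, 0.8], [0.01, 0.01, 0.01])` pools three hypothetical studies of equal precision; the I 2 value it returns can then be read against the 25 %, 50 % and 75 % cut-offs.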
When the test showed significant heterogeneity, the sources of heterogeneity were explored with a meta-regression analysis, through a stratification by the following characteristics of the FFQ and of the validation study: reference method, divided into the two categories of food record (FR) and 24 h recall (24-HR); number of food items, with the two classes <114 and ≥114 (where 114 is the median value of the number of food items extracted from the FFQ); administration mode, which includes interviewer-administered (IW) and self-administered (SA) modes; collection setting, as school and non-school environment; consumption interval, with the two categories considered being previous year/6 months and previous month/week of consumption; portion size estimation method, with household units and visual serving sizes; number of subjects, with number ≤80 and >80; and study quality, where low-quality studies were compared with high-quality studies. In order to judge the methodological quality of studies based on the validation characteristics, the authors carried out a study quality assessment( Reference Tabacchi, Amodio and Di Pasquale 5 ), according to the summary score described by Serra-Majem et al.( Reference Serra-Majem, Frost Andersen and Henríque Sánchez 6 ), which classified studies as very good, good, acceptable/reasonable or poor. Since the number of studies is limited, in order to have variables with two modalities, the high and low categories were chosen for the meta-regression.
Sensitivity analyses were conducted to examine the contribution of each individual study and to evaluate the impact of outlier studies (i.e. observations that deviate so much from other observations as to arouse suspicion that they were generated by a different mechanism). Each study was eliminated in turn from the meta-analysis and the point estimates obtained including and excluding that study were compared.
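The leave-one-out procedure can be sketched as follows. This is only an illustration under simplifying assumptions: it uses a fixed-effect inverse-variance estimate for brevity, and the helper names and study labels are hypothetical.

```python
def fixed_effect(effects, variances):
    """Inverse-variance weighted mean of the study effects."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def leave_one_out(labels, effects, variances):
    """Re-pool after omitting each study in turn.

    Returns the full pooled estimate and, per study, a tuple
    (label, estimate without that study, shift from the full estimate).
    """
    effects = list(effects)
    variances = list(variances)
    full = fixed_effect(effects, variances)
    rows = []
    for i, label in enumerate(labels):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_variances = variances[:i] + variances[i + 1:]
        estimate = fixed_effect(rest_effects, rest_variances)
        rows.append((label, estimate, estimate - full))
    return full, rows


# Hypothetical data: study "C" pulls the pooled estimate upwards.
full, rows = leave_one_out(["A", "B", "C"], [0.2, 0.3, 1.0], [0.02, 0.02, 0.02])
most_influential = max(rows, key=lambda row: abs(row[2]))[0]
```

A study whose omission shifts the pooled estimate far more than the others is a candidate outlier to examine further.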
To assess the potential for publication bias, the Egger test( Reference Sterne and Egger 14 , Reference Egger, Davey Smith and Schneider 15 ) was performed to examine the relative symmetry of individual study estimates around the overall estimate, and the results were displayed in a Galbraith plot (where the standard normal deviate of the intervention effect estimate is plotted against its precision). To overcome the limitation of the Egger test in the presence of small studies, evidence of asymmetry was based on P<0·1 and intercepts were presented with 90 % confidence intervals, as suggested by Egger et al.( Reference Egger, Davey Smith and Schneider 15 ). In line with the suggestion that the use of this test is not reasonable for fewer than ten studies, the analysis included fourteen studies, and the outcome measures were energy, all macronutrient, Ca and Fe intakes.
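The regression behind the Egger test can be sketched as an ordinary least-squares fit of the standard normal deviate (effect divided by its standard error) on precision (the reciprocal of the standard error); an intercept far from zero suggests asymmetry. The sketch below returns the intercept and its standard error only. Turning these into a P value or a 90 % confidence interval requires a t distribution with n−2 degrees of freedom, which is deliberately left out here. The function name and input values are hypothetical.

```python
import math


def egger_regression(effects, std_errors):
    """Fit (effect / SE) = intercept + slope * (1 / SE) by least squares.

    Returns (intercept, slope, standard error of the intercept).
    """
    n = len(effects)
    if n < 3:
        raise ValueError("at least three studies are needed")

    x = [1.0 / se for se in std_errors]                    # precision
    y = [e / se for e, se in zip(effects, std_errors)]     # normal deviate

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance on n - 2 degrees of freedom.
    residual_ss = sum((yi - (intercept + slope * xi)) ** 2
                      for xi, yi in zip(x, y))
    residual_var = residual_ss / (n - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))

    return intercept, slope, se_intercept
```

When every study reports the same effect regardless of its size, the fitted line passes through the origin; when smaller studies report systematically larger effects, the intercept moves away from zero.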
In detail, the meta-analysis of correlation coefficients was conducted by retrieving all effect sizes in the form of Pearson’s or Spearman’s correlation coefficients, and estimating the pooled effect for energy and each nutrient considered. Pearson’s or Spearman’s correlation coefficients were used respectively when the sample distribution was normal (or transformed into a normal one) and when it was skewed. In some studies the correlation was reported raw; in others the presentation of results included the adjustment of nutrients for total energy intakes using regression techniques (energy-adjusted values) and/or values de-attenuated from the weakening effect of measurement error. Thus, for each identified FFQ, raw and de-attenuated/energy-adjusted (de-att/E-adj) Pearson’s and Spearman’s correlation coefficients were extracted and the effect sizes of both the raw and the de-attenuated and/or energy-adjusted correlation coefficients were estimated. Following the recommendation by Hunter and Schmidt( Reference Hunter and Schmidt 16 ), correlation coefficients were not transformed into Fisher’s Z scores, as this transformation produces an upward bias in the mean estimation of the correlation because of the larger weights given to the larger correlations. Moreover, this upward bias is usually larger than the negligible downward bias produced by untransformed correlations.
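The direction of the bias mentioned above can be seen in a small numerical sketch that averages the same correlations with and without the Fisher transformation. For simplicity it weights by sample size (and by n−3 on the Z scale), which is an assumption of this illustration rather than the inverse-variance weighting used in the analysis; the function names and values are hypothetical.

```python
import math


def mean_r_untransformed(correlations, sample_sizes):
    """Sample-size weighted mean of the raw correlation coefficients."""
    total = sum(sample_sizes)
    return sum(n * r for r, n in zip(correlations, sample_sizes)) / total


def mean_r_fisher(correlations, sample_sizes):
    """Mean correlation obtained by averaging Fisher Z scores.

    Each r is mapped to z = atanh(r), the z values are averaged with
    weights n - 3, and the mean is mapped back with tanh.
    """
    weights = [n - 3 for n in sample_sizes]
    z_values = [math.atanh(r) for r in correlations]
    mean_z = sum(w * z for w, z in zip(weights, z_values)) / sum(weights)
    return math.tanh(mean_z)


# Two hypothetical studies of equal size with very different correlations.
raw_mean = mean_r_untransformed([0.2, 0.8], [50, 50])
fisher_mean = mean_r_fisher([0.2, 0.8], [50, 50])
```

With these inputs the back-transformed Fisher mean is larger than the untransformed mean, because the transformation stretches the larger correlation more than the smaller one.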
Cohen’s rule of thumb for interpretation of the correlation coefficients was followed: a value of 0·1 indicates a small effect, a value of 0·25 indicates a medium effect and a value of 0·4 a large effect( Reference Cohen 17 ).
The nutrients with fewer than three correlation coefficient values reported (vitamin D, Cu, iodine, starch, alcohol) were excluded from the analysis. The sex-specific correlation coefficients between FFQ and the reference method were not stated in most studies; therefore, we did not include them in the study comparison. When two correlation coefficients were available for males and females, their mean was used as the representative value.
With regard to the meta-analysis of means and standard deviations, values for energy, macronutrient and micronutrient intakes were extracted for the test (FFQ) and reference methods (FR and 24-HR) in all the retrieved studies. They were incorporated in a meta-analysis study to estimate the overall effect, expressed as the standardized mean difference (SMD). The SMD was used since the studies all assessed the same outcome (energy and nutrients) but measured it by using instruments with different characteristics. It expresses the size of the intervention effect in each study relative to the variability observed in that study. Cohen’s rule of thumb for interpretation of the SMD statistic was followed: a value of 0·2 indicates a small effect, a value of 0·5 indicates a medium effect and a value of 0·8 or larger indicates a large effect( Reference Cohen 17 ).
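A minimal sketch of the SMD calculation for a single study follows. It assumes Cohen's d with a pooled standard deviation and a common large-sample variance approximation, treating the FFQ and reference measurements as two groups; the text does not state which SMD variant was used, so these choices, the function names and the example values are assumptions. The labels follow the bands used when reporting the results (below 0·20, 0·21–0·50, 0·51–0·80, above).

```python
import math


def smd(mean_ffq, sd_ffq, n_ffq, mean_ref, sd_ref, n_ref):
    """Standardized mean difference (Cohen's d) and its approximate variance.

    A positive value means the FFQ estimate is higher than the reference.
    """
    pooled_sd = math.sqrt(
        ((n_ffq - 1) * sd_ffq ** 2 + (n_ref - 1) * sd_ref ** 2)
        / (n_ffq + n_ref - 2)
    )
    d = (mean_ffq - mean_ref) / pooled_sd
    # Large-sample approximation to the sampling variance of d.
    var_d = (n_ffq + n_ref) / (n_ffq * n_ref) + d ** 2 / (2 * (n_ffq + n_ref))
    return d, var_d


def effect_label(d):
    """Label the magnitude of an SMD, ignoring its sign."""
    size = abs(d)
    if size <= 0.20:
        return "very small"
    if size <= 0.50:
        return "small"
    if size <= 0.80:
        return "medium"
    return "large"


# Hypothetical study: energy intake (kcal/d) by FFQ v. food record.
d, var_d = smd(2200, 500, 100, 2000, 500, 100)
label = effect_label(d)
```

The pair `(d, var_d)` from each study is what an inverse-variance pooling step would take as input.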
Analysis of kappa agreement, percentiles and mean agreement/limits of agreement
Weighted kappa (κ w) values, which were used as a measure of agreement( Reference Cohen 18 ) between FFQ and the reference method, were extracted. The agreement was classified with the following thresholds established by Landis and Koch( Reference Landis and Koch 19 ): κ w≤0 indicates less than chance agreement; κ w=0·01–0·20 indicates slight agreement; κ w=0·21–0·40 fair agreement; κ w=0·41–0·60 moderate agreement; κ w=0·61–0·80 substantial agreement; κ w=0·81–0·99 indicates almost perfect agreement.
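For reference, a weighted kappa can be computed from a square cross-classification table as sketched below, followed by the Landis and Koch labelling. The sketch assumes linear agreement weights; individual validation studies may have used other weighting schemes, and the function names and tables are hypothetical.

```python
def weighted_kappa(table):
    """Linearly weighted kappa for a k x k table of counts.

    table[i][j] is the number of subjects placed in category i by the
    FFQ and in category j by the reference method.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    observed = 0.0
    expected = 0.0
    for i in range(k):
        for j in range(k):
            # Full credit on the diagonal, partial credit nearby.
            weight = 1.0 - abs(i - j) / (k - 1)
            observed += weight * table[i][j] / n
            expected += weight * row_totals[i] * col_totals[j] / n ** 2
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Map a kappa value to the agreement label used in the text."""
    if kappa <= 0:
        return "less than chance"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical tertile table in which every subject is classified identically.
perfect = weighted_kappa([[10, 0, 0], [0, 10, 0], [0, 0, 10]])
```

A table with identical counts in every cell gives a value of zero, that is, agreement no better than chance.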
The proportions of individuals classified into percentiles (quintiles, quartiles and tertiles) were extracted in order to evaluate the ability of the FFQ in ranking subjects across levels of nutrient intake.
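The ranking comparison can be sketched as follows for paired intakes from the two methods: each subject is assigned to a quantile group under each method, and the proportions classified into the same, an adjacent and the opposite extreme group are counted. This is an illustrative sketch for three or more groups, not the procedure of any particular study; ties are broken by input order and the names and data are hypothetical.

```python
def quantile_groups(values, k):
    """Assign each value to one of k rank-based groups (0 = lowest)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # stable for ties
    groups = [0] * n
    for position, index in enumerate(order):
        groups[index] = position * k // n
    return groups


def cross_classification(ffq, reference, k=3):
    """Proportions in the same, adjacent and opposite quantile groups."""
    ffq_groups = quantile_groups(ffq, k)
    ref_groups = quantile_groups(reference, k)
    n = len(ffq)
    gaps = [abs(a - b) for a, b in zip(ffq_groups, ref_groups)]
    same = sum(1 for g in gaps if g == 0) / n
    adjacent = sum(1 for g in gaps if g == 1) / n
    opposite = sum(1 for g in gaps if g == k - 1) / n
    return same, adjacent, opposite


# Hypothetical intakes for six subjects, ranked identically by both methods.
same, adjacent, opposite = cross_classification(
    [10, 20, 30, 40, 50, 60], [12, 19, 33, 41, 48, 70], k=3
)
```

A high proportion in the same or an adjacent group together with a low proportion in the opposite group indicates good ranking ability.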
Mean agreement and LOA estimated through the Bland–Altman method( Reference Bland and Altman 20 ) were also analysed from some studies. This method permits determining the direction of error and estimating heteroscedasticity. If differences are approximately normally distributed and not related to the magnitude of the measures (homoscedasticity), the systematic bias is estimated by the mean of the differences and the random error is estimated by the standard deviation of the differences.
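Under the homoscedasticity assumption stated above, the mean agreement and LOA reduce to a few lines, as in this minimal sketch. The limits are taken as the mean difference ± 1·96 standard deviations of the differences; the function name and the paired values are hypothetical.

```python
import statistics


def bland_altman(ffq, reference):
    """Mean difference (bias) and 95 % limits of agreement.

    ffq and reference are paired intakes for the same subjects.
    """
    differences = [a - b for a, b in zip(ffq, reference)]
    bias = statistics.mean(differences)
    sd = statistics.stdev(differences)  # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical paired intakes for four subjects.
bias, lower, upper = bland_altman([10, 12, 14, 16], [9, 10, 13, 16])
```

A positive bias indicates that the FFQ gives higher values than the reference method on average; wide limits indicate poor agreement at the individual level.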
Results
Fourteen original articles retrieved through the mentioned systematic literature review and two more papers updated to May 2015( Reference Ambrosini, de Klerk and O’Sullivan 21 – Reference Watson, Collins and Sibbritt 36 ) were identified as studies assessing the validation of FFQ against reference dietary instruments, translating the food intakes into nutrient intakes and targeting adolescent populations in the age range 13–17 years (Table 1). A high variability was highlighted between the studies( Reference Tabacchi, Amodio and Di Pasquale 5 ).
PB, paper-based; WB, web-based; IW, interviewer-administered; SA, self-administered; NR, not reported; FR, food record; 24-HR, 24 h recall; YAQ, Youth/Adolescent Questionnaire; YANA-C, Young Adolescents’ Nutrition Assessment on Computer; 7 d-FRRI, 7 d weighed food record; CC, correlation coefficient; κ w, weighted kappa.
* According to Serra-Majem et al.( Reference Serra-Majem, Frost Andersen and Henríque Sánchez 6 ).
Meta-analysis study of correlation coefficients
The meta-analysis of both raw and de-att/E-adj correlation coefficients showed fair/high correlation between FFQ and FR or 24-HR for energy and all nutrients (Table 2): the overall raw effect estimate was high (correlation coefficient>0·4) for most nutrients, while it was fair (correlation coefficient=0·25–0·39) for sugar, PUFA, cholesterol, vitamin C, vitamin A, carotene, vitamin E and Zn; the overall de-att/E-adj effect size was high for most nutrients, and fair for sugar, MUFA, PUFA, vitamin A and Na.
NA, not available.
However, the heterogeneity was high for energy and most nutrients in raw correlation coefficients, and for half of the nutrients in de-att/E-adj correlation coefficients (Table 2). Homogeneity was found only for raw values of vitamin B12 (I 2=0·0 %, P=0·962) and de-att/E-adj values of SFA (I 2=0·0 %, P=0·984); moderate heterogeneity was found for raw values of carotene (I 2=35·3 %, P=0·186) and for de-att/E-adj values of protein, sugar, MUFA, Mg, P and K (Table 2).
Taking into account both the correlation coefficients and I 2 values, these values were plotted (data not shown) to obtain values with fair/high correlation (>0·25) and low/moderate heterogeneity (I 2<50 %) at the same time: for SFA, MUFA, protein, Mg, K and P, the two methods were well correlated and studies were quite homogeneous.
In order to investigate the factors influencing the high heterogeneity of the de-att/E-adj values, we stratified by the characteristics of the study and of the FFQ. For energy and vitamin A, the stratified analysis did not show any heterogeneity reduction; this indicates that other not observed variables, different from the characteristics of the study and FFQ, could have generated heterogeneity, such as sex, which could not be evaluated in our stratification as very few studies provided data separately for males and females.
For the other nutrients, the heterogeneity was explained mainly by the following variables: IW administration mode and number of food items ≥114. Notably, for total fat, the stratification by administration mode highlighted the IW mode as a source of heterogeneity (Fig. 1).
Meta-analysis study of means and standard deviations
Table 3 shows the effect estimate with the 95 % confidence interval, heterogeneity and P value for energy and each nutrient. A significant very small effect (SMD<0·20) of the FFQ compared with the reference method was found for protein, total fat, PUFA, cholesterol, vitamin A, vitamin E, thiamin, niacin, vitamin B6, folic acid, Na and Fe; a small effect (SMD=0·21–0·50) was found for energy, carbohydrate, SFA, MUFA, riboflavin, vitamin B12, Ca and P.
For sugar, carotene and K, a significant SMD value between 0·51 and 0·80 was found in the direction of an overestimation. A large effect (SMD>0·80) was not found for any of the nutrients.
Sugar, fibre, vitamin C, carotene, Mg, K and Zn showed significant overestimation when measured by the FFQ compared with the reference instrument. PUFA, cholesterol and thiamin showed an underestimation, but the SMD was not significant.
Results referring to the heterogeneity indicated that it was very high for all nutrients except for sugar and vitamin B6 (Table 3).
We explored the sources of heterogeneity for energy and nutrients through stratification by the methodological characteristics of the study and of the instrument used. The sensitivity analysis revealed that the exclusion of outliers from the analyses in most cases influenced the overall results.
With regard to energy, the heterogeneity always remained high after stratification alone. When the stratification was combined with a sensitivity analysis excluding an outlier study( Reference Deschamps, De Lauzon-Guillain and Lafay 25 ), the SMD remained low/medium and the heterogeneity was eliminated in SA studies (Fig. 2) and reduced in low-quality studies (SMD=0·29, 95 % CI 0·06, 0·52; I 2=49·6 %, P=0·138) and in FFQ asking for consumption in the previous month/week (SMD=0·11, 95 % CI −0·11, 0·32; I 2=36·6 %, P=0·207).
The results for almost all nutrients also showed significant heterogeneity across the studies. The initial overall effect for carbohydrates was 0·45 and studies showed high heterogeneity. The exclusion of outliers( Reference Arajuo, Yokoo and Pereira 22 , Reference Rockett, Berkey and Colditz 31 ) reduced the SMD (0·28, 95 % CI 0·10, 0·46) and decreased heterogeneity, even though it remained at high levels. When stratifying to investigate the sources of heterogeneity, the FFQ with SA mode of administration had lower heterogeneity (I 2=35·1 %, P=0·214) compared with the IW mode (I 2=81·7 %, P<0·001), even though the effect was higher than in the IW mode (SMD=0·58, 95 % CI 0·34, 0·81 v. SMD=0·18, 95 % CI −0·01, 0·36). The studies with <80 subjects were moderately heterogeneous (I 2=44·8 %, P=0·107) even though the SMD was 0·60. Reference method, collection setting, portion size estimation method and quality did not affect the overall effect.
For the intake of fibre, evaluated after excluding the outliers( Reference Arajuo, Yokoo and Pereira 22 , Reference Nurul-Fadhilah, SzeTeo and Huat Foo 29 , Reference Rockett, Berkey and Colditz 31 ), the consumption interval of the previous month/week showed a low effect and a medium heterogeneity (SMD=0·23, 95 % CI −0·02, 0·48; I 2=50·8 %, P=0·131). Stratifying by food items, the FFQ with fewer than 114 items had SMD of 0·21 (95 % CI 0·01, 0·41) and a reduced heterogeneity (I 2=64·1 %, P=0·016). The low-quality studies showed SMD of 0·37 (95 % CI 0·17, 0·56) and I 2=33·1 %, P=0·225.
Concerning protein, heterogeneity remained high even after eliminating the outliers( Reference Lietz, Barton and Longbottom 27 , Reference Rockett, Berkey and Colditz 31 ). The stratification analysis revealed that SA FFQ had a fair SMD of 0·37 (95 % CI 0·11, 0·64) and low heterogeneity (I 2=49·0 %, P=0·139) compared with the IW ones. Also study quality influenced heterogeneity, with high-quality studies explaining the heterogeneity (Fig. 3).
For total fat intake, after the exclusion of three outliers( Reference Arajuo, Yokoo and Pereira 22 , Reference Cullen, Watson and Zakeri 24 , Reference Rockett, Berkey and Colditz 31 ), the heterogeneity was reduced for the consumption interval of the previous month/week (SMD=0·09, 95 % CI −0·13, 0·30; I 2=31·4 %, P=0·227).
Analysing the intake of SFA, after eliminating the outliers( Reference Ambrosini, de Klerk and O’Sullivan 21 , Reference Rockett, Berkey and Colditz 31 ), an overall medium SMD (0·40) was observed and heterogeneity decreased (I 2=52·0 %, P=0·100).
In the sensitivity analysis for PUFA, the exclusion of one study( Reference Rockett, Berkey and Colditz 31 ) did not modify the overall heterogeneity. Stratifying by study quality, low-quality studies had a low effect and were homogeneous (SMD=0·25, 95 % CI 0·09, 0·41; I 2=0·0 %, P=0·595).
After exclusion of the outliers( Reference Martinez, Philippi and Estima 28 , Reference Rockett, Berkey and Colditz 31 ) and sensitivity analysis for MUFA, the effect remained low/medium and the heterogeneity decreased for SA administration mode (SMD=0·31, 95 % CI −0·12, 0·73; I 2=72·7 %, P=0·055) and household units (SMD=0·23, 95 % CI 0·13, 0·33; I 2=3·6 %, P=0·308).
In the cholesterol analysis, the stratification by excluding the outlier( Reference Rockett, Berkey and Colditz 31 ) showed that studies using FR as reference method and low-quality studies became homogeneous (SMD=0·25, 95 % CI 0·17, 0·33; I 2=0·0 %, P=0·958 and SMD=0·27, 95 % CI 0·11, 0·43; I 2=0·0 %, P=0·857, respectively).
With respect to the vitamins, the sensitivity analysis showed that studies where the number of food items was ≥114 were homogeneous and with a low effect estimate for thiamin (SMD=−0·11, 95 % CI −0·20, 0·03; I 2=0·0 %, P=0·513). After excluding the outlier( Reference Nurul-Fadhilah, SzeTeo and Huat Foo 29 ), riboflavin showed less heterogeneity and a low SMD in studies using FR as the reference method (SMD=0·26, 95 % CI 0·12, 0·41; I 2=52·6 %, P=0·097), in FFQ having number of food items ≥114 (SMD=0·22, 95 % CI 0·13, 0·31; I 2=3·9 %, P=0·353), in FFQ being IW (SMD=0·25, 95 % CI 0·14, 0·36; I 2=36·1 %, P=0·209) and administered within the school environment (SMD=0·32, 95 % CI 0·20, 0·45; I 2=8·2 %, P=0·337). Vitamin C showed low heterogeneity and low effect in FFQ with <80 subjects (SMD=0·76, 95 % CI 0·50, 1·02; I 2=0·00 %, P=0·662). For folic acid, the school environment showed SMD of 0·27 (95 % CI 0·15, 0·39) with I 2=0·0 % and P=0·554 (Fig. 4). For vitamin A, the SA mode had SMD of 0·59 (95 % CI 0·28, 0·90; I 2=48·3 %, P=0·164).
With regard to minerals, the analysis of Fe, after excluding the outlier( Reference Arajuo, Yokoo and Pereira 22 ), showed low heterogeneity and low effect when the FR was used as the reference method (SMD=0·25, 95 % CI 0·12, 0·38; I 2=47·4 %, P=0·107). For Mg intake, different variables explained the heterogeneity, even though the SMD was always significant: the 24-HR method (SMD=0·18, 95 % CI 0·02, 0·35; I 2=18·4 %, P=0·268); number of food items <114 (SMD=0·43, 95 % CI 0·17, 0·69; I 2=58·7 %, P=0·089; Fig. 5); the previous month/week (SMD=0·32, 95 % CI 0·11, 0·52; I 2=27·1 %, P=0·241); the SA method (SMD=0·71, 95 % CI 0·44, 0·98; I 2=50·4 %, P=0·133); and number of subjects <80 (SMD=0·58, 95 % CI 0·32, 0·83; I 2=0·0 %, P=0·485).
Correlation coefficients and standardized mean differences
Finally, plotting the correlation coefficients v. the SMD for energy and nutrients (Fig. 6), a high agreement between the effect estimate derived from the two meta-analyses (correlation coefficient>0·40 and SMD<0·20) was present for protein, total fat, thiamin, niacin, vitamin B6, folic acid, Fe and Na, thus indicating that these nutrients are well assessed by the FFQ. Sugar and carotene intakes had, instead, both low correlation coefficient and high SMD (Fig. 6). All the other nutrients showed low/moderate SMD and fair correlation coefficients.
Analysis of kappa agreement, percentiles and mean agreement/limits of agreement
These data were reported as measures of FFQ validity by some of the considered studies.
With regard to kappa agreement, five studies reported the related values for macronutrients( Reference Arajuo, Yokoo and Pereira 22 , Reference Bertoli, Petroni and Pagliato 23 , Reference Hong, Dibley and Sibbritt 26 , Reference Martinez, Philippi and Estima 28 , Reference Vereecken, De Bourdeaudhuij and Maes 34 ) ranging from fair to moderate agreement (Table 4), with a mean κ w of 0·43 for energy, 0·29 for protein, 0·40 for carbohydrate, 0·29 for total fat and 0·36 for Ca. Lower κ w values were found only for PUFA (0·15)( Reference Martinez, Philippi and Estima 28 ), protein (0·16)( Reference Bertoli, Petroni and Pagliato 23 ), SFA (0·18) and vitamin C (0·17)( Reference Martinez, Philippi and Estima 28 ) intakes.
κ w, weighted kappa.
An overall good ranking ability was evidenced by data of percentiles. Eleven studies calculated the percentage of subjects’ ranking( Reference Ambrosini, de Klerk and O’Sullivan 21 – Reference Bertoli, Petroni and Pagliato 23 , Reference Deschamps, De Lauzon-Guillain and Lafay 25 – Reference Rockett, Berkey and Colditz 31 , Reference Vereecken, De Bourdeaudhuij and Maes 34 ) through the quintile, quartile or tertile method, reporting good ranges of agreement and low ranges of disagreement (Table 4).
Some studies reported good/acceptable estimates of mean agreement and LOA( Reference Bertoli, Petroni and Pagliato 23 , Reference Hong, Dibley and Sibbritt 26 , Reference Shatenstein, Amre and Jabbour 32 , Reference Watson, Collins and Sibbritt 36 ), except for retinol in the study by Hong et al.( Reference Hong, Dibley and Sibbritt 26 ) and for Ca in the study by Watson et al.( Reference Watson, Collins and Sibbritt 36 ), which showed wide LOA. Other studies( Reference Ambrosini, de Klerk and O’Sullivan 21 , Reference Arajuo, Yokoo and Pereira 22 , Reference Cullen, Watson and Zakeri 24 , Reference Lietz, Barton and Longbottom 27 , Reference Vereecken, De Bourdeaudhuij and Maes 34 ) showed, on the contrary, low values of agreement, thus suggesting that the examined FFQ are not able to assess the absolute intake of nutrients in adolescents.
Discussion
The present analysis showed a good overall correlation and agreement between the FFQ and the reference method in collecting data on energy and nutrient intakes in studies on adolescents. It provided information on the factors that could negatively affect the accuracy of an FFQ, namely IW administration mode, consumption interval of the previous year/6 months and high number of food items.
Moreover, the study added indications on what nutrients should be taken particularly into account when assessing their intake through an FFQ, such as sugar, carotene and K, whose intake was on average significantly overestimated by the use of FFQ.
When examining the degree of correlation, all the retrieved studies reported correlation coefficient values, and the overall correlation was fair/high for all nutrients considered. The heterogeneity was high for raw correlation coefficients, while it decreased in de-att/E-adj values, thus suggesting that it is important to correct for the weakening effect of measurement error and for energy intake when performing statistical analysis in these kinds of study. After exploring sources of heterogeneity, two variables were shown mainly to affect FFQ validity: IW administration mode and number of food items ≥114. Therefore, the SA mode could be considered a valid approach to questionnaire administration, as it is inexpensive, quick and well suited for simple questionnaires, and avoids the confidentiality issues and the engagement of human resources that administration by an interviewer entails. Similarly, an FFQ that is not too long could provide accurate information on nutrient intake, as adolescents can better focus on their intake. With regard to the meta-analysis of means and standard deviations, a very small or small effect was found for over- or underestimation, thus revealing that the FFQ could be considered an accurate instrument for assessing intakes of energy and most nutrients in adolescents. Only sugar, carotene and K (and their food sources) require particular attention when their intake is assessed through an existing or a new FFQ since, despite the fair/high correlation coefficients found between the FFQ and the reference method, their intake was not assessed well through the examined FFQ.
The overestimation of sugar is probably due to the difficulty in the evaluation of sugar in the different foods such as soft drinks, biscuits, cakes, ice creams, chocolate, sweets or candies; overestimates can occur because the added sugars in the pyramid tip include oligosaccharides( Reference Sigman-Grant and Morita 37 ). The carotene and K overestimates are not easy to elucidate. It is likely that the overestimation could be higher with items least frequently reported. This suggests that careful consideration must be given to the measurement of their dietary sources when a new FFQ has to be developed. The main sources of carotene are carrots, dark green leafy vegetables, melons and squashes, peas, broccoli, and tree fruits such as sour cherries and apricots. The main sources of K are fruit (dried apricots, avocados and bananas), dark leafy greens, legumes such as white beans, and cereals.
Studies assessing sugar and vitamin B6 intakes were found to be quite homogeneous. However, vitamin B6 intake was assessed in only two studies, and since a limitation of meta-analysis is its low power when studies are few, this result should be interpreted with caution. The result may also reflect the fact that vitamin B6 is found in high concentrations in a few food items that are generally consumed in small quantities, such as Marmite or seeds.
Even though combining crude means and standard deviations initially resulted in high heterogeneity for energy and all nutrients, this variability was explained by some characteristics of the FFQ and of the study design. The strongest contributors to the heterogeneity for all the other nutrients were the IW administration mode and a consumption interval of the previous year/6 months, which should be carefully considered when developing and validating a new FFQ. The meta-analysis of correlation coefficients partially confirmed these findings, indicating the IW administration mode as a powerful source of heterogeneity. The SMD, however, provides a clearer indication of the difference between the intakes assessed through the FFQ and the reference method, whereas the analysis of correlation coefficients provides only the degree of association between the FFQ and the reference method and is not appropriate to assess validity( Reference Hebert and Miller 38 , Reference Chinn 39 ). The meta-analysis of means and standard deviations is thus better suited to predicting the accuracy of the examined instrument.
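The SMD mentioned above can be illustrated with a short Python sketch that computes a standardized mean difference (here Hedges' g, with its small-sample correction) from the mean, standard deviation and sample size reported for the FFQ and for the reference method. The function name and the input values are hypothetical. The formula treats the two sets of measurements as independent groups, which is a simplification for validation studies in which the same adolescents complete both methods; it is not presented as the exact estimator used in the present analysis.

```python
import math


def hedges_g(mean_ffq, sd_ffq, n_ffq, mean_ref, sd_ref, n_ref):
    """Standardized mean difference (FFQ minus reference) and its variance.

    A positive value indicates overestimation by the FFQ relative to the
    reference method; a negative value indicates underestimation.
    """
    df = n_ffq + n_ref - 2
    # Pooled standard deviation of the two sets of measurements
    pooled_sd = math.sqrt(
        ((n_ffq - 1) * sd_ffq ** 2 + (n_ref - 1) * sd_ref ** 2) / df
    )
    d = (mean_ffq - mean_ref) / pooled_sd
    # Small-sample correction factor (Hedges)
    j = 1.0 - 3.0 / (4.0 * df - 1.0)
    g = j * d
    var_g = j ** 2 * (
        (n_ffq + n_ref) / (n_ffq * n_ref) + d ** 2 / (2.0 * (n_ffq + n_ref))
    )
    return g, var_g


# Hypothetical energy intakes (kJ/d): FFQ vs. food record in one study
g, var_g = hedges_g(9800.0, 2600.0, 150, 9300.0, 2300.0, 150)
```

The study-level values of g and their variances can then be combined with inverse-variance weights in the same way as any other effect size.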
In the analysis of means and standard deviations, the number of food items did not show a clear direction of influence on the validity of the FFQ. It has been suggested that the food list should not be shortened too much when developing FFQ to rank persons according to nutrient intake( Reference Molag, de Vries and Ocké 4 ), as short FFQ lack detail on the intake of some foods. On the other hand, there is some evidence that overestimation increases with the length of the food list( 40 ); long and extensive FFQ may also contribute to lower response rates, since subjects may need a long time to answer and become fatigued and frustrated, which conflicts with the purpose of developing a fast and easy FFQ. Therefore, we think that a potential new FFQ should contain no more than 114 food items.
One important issue when considering the validity of an FFQ is the food composition database that is used to convert foods into nutrients. Even though the influence of the use of different databases could not be evaluated by the current meta-analysis, it would be interesting to evaluate not only the number but also the allocation of food items, and to compare them across the different FFQ. A common procedure when developing a new FFQ is to arrange the composition database according to the way the foods are grouped. Different FFQ often group foods in different ways, which leads to variable conversion into nutrient intakes and to loss of information. Future studies could evaluate how the foods are grouped and whether the different FFQ contain all the important food items.
The estimation of portion size is difficult for adults and children and is potentially a large source of error in dietary assessment; food models appear to be less accurate than photographs for estimating portion size( 40 ). In line with the study by Molag et al.( Reference Molag, de Vries and Ocké 4 ), the portion size estimation method was not found to affect validity in one specific direction; we therefore decided that the portion size estimation method of the ASSO-FFQ will be based on photographs and, when necessary, on household units. However, it is important that the portion size photographs are age appropriate, in order to reduce overestimation( Reference Foster, Matthews and Nelson 41 ).
Although a high heterogeneity across studies was initially shown, information on the sources of heterogeneity was obtained from the subgroup analysis, from the sensitivity analysis and the exploration of publication bias.
Kappa agreement, the percentile method and the Bland–Altman method are suitable to assess the accuracy of a questionnaire, but not all the retrieved studies reported them as measures of their FFQ validity.
An overall fair/moderate agreement between the FFQ and the reference method, measured by κ w, was reported in five studies, confirming that the FFQ is able to assess nutrient intakes fairly well. The ability to rank subjects according to levels of nutrient intake was consistently present, as evidenced by the percentile values reported in nine of the considered studies. Five out of nine studies, however, reported a low agreement estimated through the Bland–Altman analysis, which does not confirm an overall absolute validity of the examined FFQ. It should be noted that the Bland and Altman method, which includes the LOA, remains the one suggested for assessing the absolute validity of an FFQ, but unfortunately it is used in few studies.
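For readers less familiar with the LOA, the Bland–Altman quantities can be sketched in a few lines of Python: the mean difference between the two methods (the bias) and the interval given by the bias ± 1.96 standard deviations of the individual differences. The function name and the intake values below are hypothetical and serve only as an illustration.

```python
import statistics


def bland_altman(ffq, ref):
    """Return (bias, lower LOA, upper LOA) for paired intake estimates.

    ffq and ref are equal-length sequences holding, for each subject, the
    intake estimated by the FFQ and by the reference method.
    """
    diffs = [a - b for a, b in zip(ffq, ref)]
    bias = statistics.mean(diffs)      # mean difference (FFQ minus reference)
    sd = statistics.stdev(diffs)       # sample SD of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical daily intakes (mg/d) for five adolescents
bias, lower, upper = bland_altman(
    [2900.0, 3400.0, 2500.0, 3100.0, 2800.0],
    [2700.0, 3000.0, 2600.0, 2900.0, 2500.0],
)
```

Wide limits relative to the mean intake indicate poor agreement at the individual level even when the correlation between the two methods is fair or high, which is why the two kinds of analysis can lead to different conclusions.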
A limitation of the present meta-analysis is that it is based on observational studies; therefore, many confounding factors that might affect the correlation of energy and nutrient intakes between FR and FFQ could not be controlled. Moreover, our selection criteria excluded articles published before the year 2000 and papers analysing validity for specific nutrients only, which could have influenced our results in different ways. For example, we could have collected more data on some nutrients such as starch, vitamin D, Cu and iodine and performed the analysis on them as well; results on vitamin B6, which were found in only two studies, could have been affected by the presence of other data on that vitamin.
Another limitation is that we could not remove the effect of sex by conducting separate analyses for males and females, since very few studies provided sex-specific data. It is known, however, that females generally evaluate their food intake better. Moreover, Galbraith plots revealed asymmetry for energy (Fig. 7) and carbohydrates, indicating the presence of publication bias; results obtained for energy and carbohydrates should therefore be interpreted with caution. For all the other nutrients no publication bias was present. For some nutrients, such as starch, vitamin D, Cu, iodine and alcohol, intake was not collected in all studies, making it more difficult to draw conclusions.
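The asymmetry read from a Galbraith plot can also be summarized numerically. The sketch below, in Python, places each study on the Galbraith axes (precision 1/SE on the x-axis, standardized effect on the y-axis) and fits an ordinary least-squares line; an intercept far from zero suggests asymmetry, which is the idea behind Egger's regression. This is an assumption made for illustration: the present analysis reports Galbraith plots only, and the function name and data are hypothetical.

```python
def galbraith_regression(effects, ses):
    """Least-squares fit of effect/SE on 1/SE.

    Returns (intercept, slope). The slope estimates the common effect;
    an intercept far from zero points to small-study asymmetry.
    """
    xs = [1.0 / se for se in ses]                 # precision
    ys = [e / se for e, se in zip(effects, ses)]  # standardized effect
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical SMDs and standard errors from five studies
intercept, slope = galbraith_regression(
    [0.45, 0.30, 0.22, 0.15, 0.12],
    [0.30, 0.22, 0.15, 0.10, 0.08],
)
```

A formal test would additionally require the standard error of the intercept, which is omitted here for brevity.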
There could be other limitations due to factors that could not be analysed within the present meta-analysis, such as the adolescents’ level of understanding of the questions. All the FFQ we analysed are specifically addressed to adolescents, so they are assumed to be age specific, with questions easily understood by the students. This could be tested in a small sample before administering the FFQ to the population, in order to verify whether it is suitable for the target population. Moreover, it should be evaluated whether the FFQ is culturally specific, as this could influence the accuracy and precision of the instrument.
Finally, too few studies were found for web-based FFQ and therefore we could not analyse the strength of the web-based method, even though our recent review( Reference Tabacchi, Amodio and Di Pasquale 5 ) showed that the FFQ from Matthys et al.( Reference Matthys, Pynaert and De Keyzer 42 ), the 24-HR ‘Synchronised Nutrition and Activity Program™’ (SNAP™)( Reference Moore, Ells and McLure 43 ), the 24-HR Young Adolescents’ Nutrition Assessment on Computer (YANA-C)( Reference Vereecken, Covents and Matthys 44 , Reference Vereecken, Covents and Sichert-Hellert 45 ), the Health Behaviour in School-aged Children (HBSC) FFQ( Reference Vereecken and Maes 46 ) and the Healthy Lifestyle by Nutrition in Adolescence (HELENA) FFQ( Reference Vereecken, De Bourdeaudhuij and Maes 34 ), all being web-based, could fit the purpose.
The present analysis of the combination of different studies on FFQ developed worldwide confirms that FFQ are robust instruments for ranking adolescents according to energy and nutrient intake levels, even though their absolute validity has not always been demonstrated.
Specific variables that can negatively affect the validity of an FFQ in relation to energy and nutrient intakes were identified: the IW administration method, a high number of food items and a long consumption interval. Some nutrients (sugar, carotene, K) were also found not to be well assessed by FFQ. These findings indicate to the scientific community how the design and validation of a new FFQ might be addressed.
Acknowledgements
Financial support: The work was performed within the Adolescents and Surveillance System for the Obesity prevention (ASSO) Project (code GR-2008-1140742, CUP I85J10000500001), a young researchers’ project funded by the Italian Ministry of Health. The Italian Ministry of Health had no role in the design, analysis or writing of this article. Conflict of interest: None. Authorship: G.T. conceived and designed the study, carried it out, analysed and interpreted the data and wrote the article. A.R.F. performed the statistical analysis and interpretation of data and contributed to drafting the article. E.A. and A.B. contributed to drafting the article and revised it critically. M.J., A.F. and C.M. revised the article critically. Ethics of human subject participation: Ethical approval was given by the ethics committee of the ‘Azienda Ospedaliera Universitaria Policlinico Paolo Giaccone’ (approval code number 9/2011).