The evaluation of a population's micronutrient consumption constitutes a true challenge for nutrition research and, in general, has received less attention than the consumption of energy and macronutrients. Food and nutrient intakes are estimated through the administration of dietary assessment methods that usually differ according to study objectives, available resources, the population under study and the design of the epidemiological study(Reference Willett, Lenart and Willett1–Reference Cameron and Van Staveren3). As such, in cross-sectional studies with the aim of evaluating food consumption and nutritional status for a given population or group of individuals, traditionally daily methods have been utilised, such as a single or multiple 24 h recalls (24HR) or food records. In contrast, epidemiological studies have mostly employed diverse variations of a Food Frequency Questionnaire (FFQ) that has been validated typically by daily methods or biochemical indicators.
Daily methods like 24HR and food records, which are less often validated than FFQ, have both advantages and disadvantages that have been amply described and, given their open-ended nature, their intrinsic value has by and large been accepted provided they are administered in adequate conditions(Reference Bingham, Nelson, Margetts and Nelson2, Reference Cameron and Van Staveren3). However, their utility has been challenged by growing awareness of misreporting in these methods, especially the selective underreporting of energy and certain food groups among given population groups(Reference Poslusna, Ruprich and de Vries4). On the other hand, FFQ used in epidemiological studies have been validated or compared to other methods. Hence, notable validation studies of FFQ can be found in the literature, initiated in a rigorous and thorough manner by the groups of Walter Willett at Harvard University(Reference Willett, Sampson and Stampfer5, Reference Willett, Reynolds and Cottrell-Hoehner6) and of Rohan and Potter at the University of South Australia(Reference Rohan and Potter7), among others.
FFQ have generally been validated for the consumption of energy, macronutrients, fatty acids, fibre and a given micronutrient. In fact, the first validation studies of these types of questionnaires addressed only a few vitamins and minerals such as vitamin C, vitamin A, Ca and Fe(Reference Willett, Lenart and Willett1, Reference Willett, Sampson and Stampfer5–Reference Rohan and Potter7). Correlations for micronutrients were usually lower than those for macronutrients, especially for those micronutrients whose intake level depended on the consumption of a great number of foods(Reference Willett, Lenart and Willett1, Reference Willett, Sampson and Stampfer5, Reference Willett, Reynolds and Cottrell-Hoehner6). Since the initial validation studies conducted in the 1980s, hundreds of these types of studies have been published and, although consensus-based criteria for how they should be conducted are lacking, in most cases the experience of Willett & Lenart(Reference Willett, Lenart and Willett1) and the Nurses' Health Study has been followed.
Within the context of the EURopean micronutrient RECommendations Aligned Network of Excellence(Reference Ashwell, Lambert and Alles8) (EURRECA), we have conducted an in-depth review of all validation studies(Reference Henríquez-Sánchez, Sánchez-Villegas and Doreste-Alonso9–Reference Ortiz-Andrellucchi, Sánchez-Villegas and Doreste-Alonso14) with the aim of analysing the utility of distinct dietary questionnaires, and particularly FFQ, for evaluating micronutrient intake. To achieve this, systematic reviews of all published validation articles that evaluated micronutrients or n-3 fatty acids were carried out so as to assess the degree of validity of these questionnaires. In general, the questionnaires under study evaluated a set of various micronutrients and, occasionally, the focus was to assess a specific micronutrient such as Ca or vitamin A. Given the EURRECA project's objective of subsequently conducting distinct meta-analyses for each of the micronutrients, it was deemed necessary to make a tool available that could be used to estimate the quality of validation studies that evaluated dietary questionnaires. To our knowledge, no such quality criteria tool has been developed to date.
The objective of the present study was to provide a tool for evaluating the quality of studies validating FFQ and to explore its applicability in a sample of validation studies. This would allow observed correlations to be weighted by the obtained quality estimates when combining a given number of studies for the evaluation of a specific micronutrient in different population groups(Reference Henríquez-Sánchez, Sánchez-Villegas and Doreste-Alonso9–Reference Ortiz-Andrellucchi, Sánchez-Villegas and Doreste-Alonso14).
Material and methods
In order to identify the most accurate method for assessing the different micronutrient and n-3 intakes in the adult population and among specific population groups, a scoring system was needed to assess the quality of the different validation studies.
As part of the present research activity on intake methods, which consisted of conducting systematic reviews of validation papers, the papers in all reviews were classified into three categories according to the reference method or gold standard applied in the validation study:
(1) The reference method was another dietary assessment method evaluating short-term intake, including 24HR and estimated or weighed records of less than 7 d.
(2) The reference method was another dietary assessment method of long-term intake, including more than 7 d of dietary collection.
(3) The reference method was a biomarker. In this case, a discussion on the selected/available biomarker and its characteristics as a recovery or concentration biomarker was needed.
The studies included in categories (1) and (2) may be called calibration and correlation rather than validation studies, since the errors of both dietary assessment methods could be correlated or might not be independent. Sources of errors tend to be replicated in both dietary assessment methods: the one being evaluated and the gold standard(Reference Willett, Lenart and Willett1, Reference Bingham, Nelson, Margetts and Nelson2). The use of biomarkers as the reference method is usually not feasible since there is a lack of markers for many micronutrients and usually they do not reflect a pure estimation of only diet.
After an initial meeting in Prague on 15–16 May 2008, where a draft list of variables to be included in the scoring system was discussed, a working group composed of the authors of the present article made decisions about the different variables and score values to be included in the tool, using a consensus-based methodology. The proposed tool was discussed at the EURRECA Integrating Meeting in Montenegro (9–13 June 2008) and also at the Steering Committee in Milano (29–30 September 2008). The final version was ready by October 2008. The variables considered were:
(1) Sample and sample size of the study, with a maximum of 1 point; 0·5 points allocated when the sample was not homogeneous for certain variables such as sex, socio-economic status, smoking and obesity, and 0·5 points when the sample size was of more than 100 individuals (fifty individuals when using biomarkers as the gold standard).
(2) Statistics to assess validity. A maximum of 3 points was allocated; 1 for comparisons between methods' means, medians or difference; from 0·5 to 1·5 according to the correlation used (crude, energy adjusted, deattenuated or intraclass); plus 0·5 when statistics to assess agreement or misclassification were utilised.
(3) Data collection. 1 point if data were gathered by personal interview.
(4) Seasonality. Only when considered in the validation design. Addition of 0·5 points.
(5) Supplements included and validated. Addition of 1·5 points when the validation study considered supplements.
Scores could range from 0 (poorest quality) to a maximum of 7 (highest quality). This allowed for the classification of validation studies according to their methodological quality (Table 1 step 1):
(1) Very good/excellent, score ≥ 5·0.
(2) Good, 3·5 ≤ score < 5.
(3) Acceptable/reasonable, 2·5 ≤ score < 3·5.
(4) Poor, score < 2·5.
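The scoring rules and quality classes above can be sketched in code. The following Python fragment is only an illustration under stated assumptions: the class and function names are hypothetical, the points awarded for the type of correlation (between 0·5 and 1·5) are passed in by the caller because their exact allocation is given in Table 1 rather than in the text, and the biomarker threshold is read as "more than fifty individuals".

```python
from dataclasses import dataclass


@dataclass
class ValidationStudy:
    """Hypothetical container for the items considered by the scoring system."""
    heterogeneous_sample: bool    # sample not homogeneous (sex, SES, smoking, obesity)
    sample_size: int
    biomarker_reference: bool     # gold standard is a biomarker
    compares_means: bool          # means, medians or differences compared
    correlation_points: float     # 0 if no correlation; otherwise 0.5 to 1.5
    agreement_statistics: bool    # agreement or misclassification statistics
    personal_interview: bool
    seasonality_considered: bool
    supplements_validated: bool


def quality_score(study: ValidationStudy) -> float:
    """Sum the item scores; the result lies between 0 and 7."""
    points = study.correlation_points
    if points != 0 and not 0.5 <= points <= 1.5:
        raise ValueError("correlation_points must be 0 or between 0.5 and 1.5")
    # Assumption: "more than" applies to both thresholds (100, or 50 with biomarkers).
    threshold = 50 if study.biomarker_reference else 100
    score = 0.0
    score += 0.5 if study.heterogeneous_sample else 0.0
    score += 0.5 if study.sample_size > threshold else 0.0
    score += 1.0 if study.compares_means else 0.0
    score += points
    score += 0.5 if study.agreement_statistics else 0.0
    score += 1.0 if study.personal_interview else 0.0
    score += 0.5 if study.seasonality_considered else 0.0
    score += 1.5 if study.supplements_validated else 0.0
    return score


def classify_quality(score: float) -> str:
    """Map a quality score to the four classes listed above."""
    if score >= 5.0:
        return "very good/excellent"
    if score >= 3.5:
        return "good"
    if score >= 2.5:
        return "acceptable/reasonable"
    return "poor"
```

A study fulfilling every item would obtain the maximum of 7 points with this sketch, and one fulfilling none would obtain 0.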
For the simultaneous analysis of several validation papers with the aim of estimating a mean combined correlation of various studies per micronutrient for a given diet assessment method, the correlation coefficient value of each study was multiplied by its quality score. Then, the sum of the weighted correlations was divided by the sum of the validation studies' quality scores. This provided us with a correlation coefficient adjusted for the study's methodological quality (see Table 1 step 2).
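The weighting step just described amounts to a weighted mean, i.e. the sum of (correlation × quality score) divided by the sum of quality scores. A minimal Python sketch follows; the function name is hypothetical and is not taken from the original analysis.

```python
from typing import Sequence


def quality_weighted_correlation(correlations: Sequence[float],
                                 quality_scores: Sequence[float]) -> float:
    """Mean correlation across studies, weighted by each study's quality score."""
    if len(correlations) != len(quality_scores):
        raise ValueError("one quality score is needed per correlation")
    total_quality = sum(quality_scores)
    if total_quality <= 0:
        raise ValueError("the sum of quality scores must be positive")
    # Multiply each correlation by its study's quality score, then normalise.
    weighted_sum = sum(r * q for r, q in zip(correlations, quality_scores))
    return weighted_sum / total_quality
```

For example, with two purely illustrative studies reporting correlations of 0·6 and 0·4 and quality scores of 6 and 2, the crude mean is 0·50, whereas the quality-adjusted value is (0·6 × 6 + 0·4 × 2)/8 = 0·55, so the higher-quality study pulls the estimate towards its own result.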
In order to analyse the implications of using the scoring system, a systematic review of the validation studies addressing the intake of at least one vitamin was performed, and the average of correlations was calculated without (crude) and with (adjusted) the scoring system. As explained in the paper from Henríquez-Sánchez et al. (Reference Henríquez-Sánchez, Sánchez-Villegas and Doreste-Alonso9), the following numbers of articles addressing each vitamin were included: 76 for vitamin A, 108 for vitamin C, 21 for vitamin D, 75 for vitamin E, 47 for folic acid, 19 for vitamin B12, 21 for vitamin B6, 49 for thiamine, 49 for riboflavin and 30 for niacin, extracted from a total of 124 different studies. The correlation values obtained for every vitamin in each of the studies were summed up, first without consideration of the quality of the study or scoring system, and second taking this factor into account. Subsequently, the correlations for vitamins A, C, D and E were classified into four categories: very good ( ≥ 0·7), good (0·5–0·69), acceptable (0·3–0·49) and poor ( < 0·3; Table 1 step 3), and comparisons between crude and adjusted figures were made using a McNemar χ² test in SPSS-PC(Reference Bland15).
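The four correlation categories can be expressed as a small classification routine. The Python sketch below is illustrative only: the function names are hypothetical, and the way the quality score enters the adjusted distribution (each study counting in proportion to its score) is an assumption about one plausible reading of the procedure, not a description of the original SPSS analysis.

```python
from collections import defaultdict
from typing import Dict, Optional, Sequence


def classify_correlation(r: float) -> str:
    """Assign a validation correlation to one of the four categories."""
    if r >= 0.7:
        return "very good"
    if r >= 0.5:
        return "good"
    if r >= 0.3:
        return "acceptable"
    return "poor"


def category_shares(correlations: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> Dict[str, float]:
    """Share of studies per category; crude if weights is None.

    Assumption: in the adjusted version each study counts in proportion
    to its quality score.
    """
    if weights is None:
        weights = [1.0] * len(correlations)
    if len(weights) != len(correlations):
        raise ValueError("one weight is needed per correlation")
    total = sum(weights)
    if total <= 0:
        raise ValueError("the sum of weights must be positive")
    shares: Dict[str, float] = defaultdict(float)
    for r, w in zip(correlations, weights):
        shares[classify_correlation(r)] += w / total
    return dict(shares)
```

Calling `category_shares` once without weights and once with the studies' quality scores gives the crude and adjusted distributions that would then be compared.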
Results
Table 1 presents the scoring system administered to evaluate the quality of the validation studies. Among the 124 validation studies analysed, the quality scores ranged from 0·5 to 6 (highest score obtained by Sudha et al. (Reference Sudha, Radhika and Sathya16)). The mean score was 3·19 and the median 3·0 (Table 2). Table 2 also shows the classification of the validation studies, with 47·5 % of the studies rated as good (41·9 %) or very good (5·6 %) quality, and 16·9 % rated as poor quality. Table 3 shows the mean and classification of crude and quality-adjusted correlation coefficients obtained for vitamins A, C, D and E. Although correlation averages were similar and no significant differences were found, the percentages of correlations below 0·3 (poor) were considerably higher after adjusting for the quality of the validation study. In contrast, for vitamins A and D, the percentages of correlations ≥ 0·7 (very good) were substantially higher after adjustment for quality.
Discussion
It is quite surprising that, to date, no systematic review evaluating validation studies of FFQ has been conducted. Over the last few decades, nutritional epidemiology has grown exponentially, due in part to the development of simplified, properly validated questionnaires for evaluating the consumption of food and nutrients. These types of instruments meet the inherent needs of epidemiological studies, in which larger population samples make it necessary to utilise data collection methods that are valid, accessible, reproducible and less costly. In general, the validation of FFQ has been achieved by comparison with multiple food records which, adequately distributed over a long period of time, typically 1 year, reflect usual intake. Multiple 24HR and biomarkers have also been utilised.
When developing the quality scoring system, five principal variables were taken into account: sample and sample size, statistics, data collection, seasonality and supplements. Sample size is an aspect that has been given minimal consideration in validation studies. According to Willett & Lenart(Reference Willett, Lenart and Willett1), selecting an appropriate number of subjects for a validation study is less than straightforward as correlations for many nutrients are likely to be examined and the precision required remains arbitrary. These authors have estimated that a reasonable sample size for a validation study should comprise between 100 and 200 people. We chose a sample size of 100 individuals as a study quality criterion when using other dietary assessment methods as a gold standard; when using biochemical markers as the reference method, a sample size of just fifty individuals could be considered satisfactory, since measurement errors in biomarkers are essentially uncorrelated with errors in any dietary assessment method(Reference Willett, Lenart and Willett1). We added 0·5 when the sample size was of more than 100 people and 0·5 when the sample was not homogeneous. Validating a questionnaire in a homogeneous sample (e.g. very healthy people) could reduce its external validity and usefulness when administering it in other samples (for example, in obese or low-income populations, among others). Other possibilities, such as weighting the study by the inverse of the variance, could also have been considered instead of using a specific sample size to assess quality.
Regarding the statistical methods used in epidemiological studies, although it is useful to compare means or medians for the two methods, it is more important to provide data on the associations between the intakes measured by the two methods(Reference Willett, Lenart and Willett1). Correlations are the most widely applied statistical procedures and should be presented and, to the extent possible, be adjusted for different variables, particularly energy, age and sex. Deattenuating the correlation is critical to reduce its dependency on within-person variation in the reference measurements. The correlation between the questionnaire measurement and the subjects' true long-term intake of micronutrients is then estimated from the observed correlation with mean reference measurements, with correction for attenuating effects due to random errors in the reference measurements themselves(Reference Kaaks17). For this reason, deattenuated correlations were assigned a higher rating than crude or adjusted correlations in the scoring system. Using biochemical markers may solve the problem of correlated errors between two dietary questionnaires, but only for those micronutrients where a recovery marker is available(Reference Kaaks, Ferrari and Ciampi18). However, as only a limited number of recovery markers are available, adjustment for energy intake could be an alternative, since it may diminish the problem of correlated random errors. For this reason, energy adjustment and the deattenuating of the correlation are critical points in the statistical methods used in the validation studies of dietary intake. Bland & Altman(Reference Bland and Altman19, Reference Bland and Altman20) recommended not using correlation analysis in these types of studies, but rather analysing the standard deviation of the difference between the two methods, since this is not influenced by between-person variation. Additionally, classification within the same tertiles of consumption may be useful as well.
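Two of the statistics discussed above can be illustrated with a short Python sketch. It assumes the commonly used deattenuation formula, in which the observed correlation is multiplied by the square root of (1 + λ/n), where λ is the ratio of within- to between-person variance in the reference method and n is the number of replicate reference measurements per subject, and the usual Bland–Altman limits of agreement (mean difference ± 1·96 SD of the differences). The function names are hypothetical, and the validation studies reviewed may have used other variants.

```python
import math
import statistics
from typing import Sequence, Tuple


def deattenuate(r_observed: float, variance_ratio: float, n_replicates: int) -> float:
    """Correct an observed correlation for within-person variation in the
    reference method: r_true = r_obs * sqrt(1 + variance_ratio / n_replicates).
    """
    if n_replicates < 1:
        raise ValueError("n_replicates must be at least 1")
    if variance_ratio < 0:
        raise ValueError("variance_ratio must be non-negative")
    return r_observed * math.sqrt(1 + variance_ratio / n_replicates)


def limits_of_agreement(method_a: Sequence[float],
                        method_b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (lower limit, mean difference, upper limit) for paired intakes."""
    if len(method_a) != len(method_b) or len(method_a) < 2:
        raise ValueError("two paired series with at least 2 subjects are needed")
    differences = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample SD of the differences
    return mean_diff - 1.96 * sd_diff, mean_diff, mean_diff + 1.96 * sd_diff
```

The deattenuated value is never smaller in magnitude than the observed one, and the correction shrinks as the number of replicate reference days increases.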
Statistical procedures account for 3 out of 7 points of the quality score as they are critical to a sound analysis. Moreover, since there is no single method to relate a surrogate measure to a measure of truth that conveys all the available information, it is probably best to analyse and present the data in several ways(Reference Ambrosini, de Klerk and Musk21, Reference Negri, Franceschi and La Vecchia22).
As for data collection procedures, FFQ and food records can be administered by personal interview or by telephone, or can be self-administered using electronic means or postal mailings. In general, the accuracy and quality of data collection are greater when using personal interviews, particularly in cross-sectional studies(Reference Serra-Majem, Ribas-Barba, Aranceta-Bartrina, Serra-Majem and Aranceta-Bartrina23). In addition, self-administered questionnaires are generally less reliable in low-income populations and those with less education(Reference Ngo, Gurinovic and Frost-Andersen24, Reference Thompson, Subar, Coulston and Boushey25). Taking the aforementioned into account, in our opinion, collecting dietary information for the gold standard or reference method by personal interview, regardless of whether the validated questionnaire is self-administered, may provide better comparisons with less possibility of correlated errors between the two methods. A score of 1 was included for those validation studies using information gathered by face-to-face interview. The qualification of the person administering the interview was not specifically considered for scoring, but it is also a factor that affects the quality of data collection (e.g. previous training, being a dietitian, or being a bilingual or cross-cultural interviewer)(Reference Kohlmeier26).
Seasonality is an important issue in nutrition epidemiology, particularly as it is directly related to the intake of certain vitamins. For this reason, it is important to take it into consideration when validating a questionnaire; usually, authors tend to distribute the food records over different time periods throughout the year(Reference Cade, Burley and Warm27). As such, we have included seasonal variation as a quality criterion, with the addition of 0·5 points for studies taking this factor into account.
Finally, the inclusion of supplements was considered an important characteristic in validation studies for micronutrients, unless the questionnaire was intentionally designed for non-supplement users only. With the exception of certain vitamins from the B complex group, for the majority of micronutrients, validation study correlations improved when supplement intake was taken into consideration(Reference Willett, Sampson and Browne28), which seems evident for those populations whose intake of supplements is qualitatively and quantitatively important. In fact, in a systematic review on vitamins(Reference Henríquez-Sánchez, Sánchez-Villegas and Doreste-Alonso9), correlations were higher when supplements were taken into account in the validation studies, especially for vitamins D, E, B6 and folic acid, but less so for vitamin B12 or vitamin A.
Some authors have developed scoring systems for evaluating dietary methodology (quality) in epidemiological studies(Reference Dennis, Snetselaar and Nothwehr29), and the quality criteria included, among others, the number of FFQ items, the form of administration and the application of feasibility testing or pre-testing. It is often difficult to come across these quality parameters in epidemiological studies, and particularly so in the abstract of the article. Nevertheless, our objective in the present analysis was not so much to judge the quality of nutrition information in epidemiological studies applying a priori defined criteria, but rather to assess the quality of validation and calibration studies of FFQ, with the aim of including, excluding or weighting the study or studies that utilise a given questionnaire in possible reviews or meta-analyses for a specific micronutrient. To our knowledge, this factor has not been taken into consideration in systematic reviews or meta-analyses of nutrition epidemiological studies before.
In the present study, we found that less than half of FFQ validation studies, evaluating any given vitamin, were of ‘good’ or ‘very good’ quality and that 17 % were of ‘poor’ quality. Quality scores were lower in those validation studies where FFQ was compared to dietary records reflecting short-term intake. When applying the quality score to adjust the correlation obtained in these validation studies for vitamins, the mean correlations were nearly the same, but the percentage of studies with correlations falling to the lowest category (poor) increased for all vitamins included in this analysis. Moreover, after adjusting for the study quality score, an increase in the percentage of studies whose correlations shifted to the highest category (very good) was seen for vitamins A and D. This information should be considered when interpreting results from studies using FFQ to evaluate vitamin intake.
The impact of measurement errors in dietary assessment instruments on the design, analysis and interpretation of nutrition studies may be much greater than previously estimated, particularly in prospective cohort studies(Reference Kipnis, Midthune and Freedman30). Recently, the limitations of FFQ compared to food records or recalls have been pointed out, particularly in light of results emerging from the EPIC and other studies(Reference Bingham, Gill and Welch31–Reference Kipnis, Midthune and Freedman38). Without a doubt, this has led to the erroneous utilisation of some FFQ, not necessarily due to the nature of the instrument in and of itself, but rather to its inappropriate application (e.g. FFQ validated in a population or for dietary components different from those in the epidemiological study); albeit, a considerable percentage of these tools may also be of low-quality design. As such, and especially when considering micronutrients, it is critical to utilise high-quality instruments in which we have obtained an adequate estimation of intake for a given micronutrient (drawn from the correlation coefficient of the validation study). The issue should not be limited to the validation of a questionnaire, but rather it should address the utility of the instrument for measuring intakes of a given micronutrient and whether the design and analysis are adequate(Reference Willett39).
As pointed out by Block & Hartman(Reference Block and Hartman40), negative outcomes may often be the result of poor instruments, and faulty data impede the progress of research; both points may be particularly relevant when studying micronutrients. According to those authors, the factors that may affect the validity of a diet questionnaire are: (1) respondent characteristics, (2) questionnaire design and quantification, (3) adequacy of the reference data and (4) quality control of data management.
Judging the quality of dietary exposure assessment in epidemiological studies is a crucial element. The scoring system proposed in the present paper or other tools that may follow can contribute to increasing the quality of the evidence in nutrition research, due to its capacity to serve as guidance for validation studies of diet questionnaires as well as for its utility in the selection and weighting of results from already conducted nutritional epidemiological studies.
Acknowledgements
The studies reported herein have been carried out within the EURRECA Network of Excellence (www.eurreca.org), financially supported by the Commission of the European Communities, specific Research, Technology and Development Programme Quality of Life and Management of Living Resources, within the Sixth Framework Programme, contract no. 036196. This report does not necessarily reflect the Commission's views or its future policy in this area. L. S.-M. developed the initial validation instrument, coordinated the discussion and consensus, designed the study and wrote the paper. L. F. A. took part in the discussions of the initial validation instrument and contributed to the discussion, consensus and manuscript content. P. H. conducted some statistical analysis and contributed to the discussion and consensus. J. D.-A. conducted some analysis and contributed to the discussion, consensus and manuscript content. A. S.-V. contributed to the discussion, consensus and manuscript content. A. O.-A. contributed to the discussion, consensus and manuscript content. E. N. contributed to the discussion, consensus and manuscript content. C. L. V. contributed to the discussion, consensus and manuscript content. The authors have no conflicts of interest to report.