Accumulated evidence suggests that nutrient intakes are associated with the development of many diseases, especially chronic conditions( Reference Cho, Qi and Fahey 1 , Reference Rice, Quann and Miller 2 ). In order to investigate this relationship, it is important to accurately assess food and nutrient intakes. The FFQ is a useful tool for the estimation of food and nutrient consumption and has been widely used in investigating the association between diet and chronic diseases in most epidemiological studies( Reference Willett, Sampson and Stampfer 3 , Reference Pietinen, Hartman and Haapa 4 ). FFQ are easy to administer and relatively inexpensive to use in large population studies. However, the performance of an FFQ is very sensitive to the ethnic, social and cultural backgrounds of the study population( Reference Sharma, Cade and Jackson 5 ). For this reason, the reliability and the validity of an FFQ need to be evaluated for studies conducted in a new population.
Reliability refers to the consistency of measurements on repetition, whereas validity refers to the ability to measure what the FFQ was designed to measure. At present, there is no perfect standard for the validation of dietary intake and a superior measurement is always used for comparison; 24-h dietary recalls (24-HDR) may be superior to FFQ and have been frequently used as the reference method in many Chinese validation studies( Reference Shu, Yang and Jin 6 – Reference Xia, Sun and Zhang 9 ). A critical review regarding validation of FFQ has shown that FFQ are validated against another dietary method in 75 % of studies. When co-operation or literacy of the study subjects is limited, 24-h recalls may be more appropriate( Reference Cade, Thompson and Burley 10 ).
As an economically developed region of China, Nanjing is experiencing a high prevalence of chronic non-communicable diseases such as hypertension, diabetes and cancer. A large community-based cross-sectional nutrition and health survey was initiated in 2014 in the Nanjing area. The main purposes of this survey were to collect information on lifestyle factors, including dietary habits, and to observe their effects on the occurrence of chronic diseases. We developed a new FFQ to estimate the nutrient and food group intakes of this population. To validate this FFQ, we gathered reference data using three consecutive 24-HDR during each of the four seasons over a period of 1 year, totalling up to 12 d of recalls for each of 203 participants. In this study, we report the reliability and validity of this FFQ for use in a large community-based nutrition and health survey.
Methods
Study population
The subjects were recruited using a multi-stage stratified random sampling method (Fig. 1). First, we randomly selected two districts (one urban and one suburban). Next, three streets/towns from each chosen district were randomly selected. Finally, one community from each chosen street/town was randomly selected. This resulted in a total number of six communities. We then randomly selected 250 eligible residents from the six communities to participate in the validation study. The inclusion criteria were as follows: local resident for at least 5 years, aged between 30 and 80 years, free of serious diseases requiring a special diet and not on a weight-reduction diet. Among the 250 selected residents, 248 were eligible to participate and 223 agreed to take part in the study and completed the survey (response rate=89·9 %). The main reasons for not participating in the study included refusal, absence during the investigation period and poor health.
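The sampling stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the nested layout of the sampling frame, the function name `multistage_sample`, the seed and the toy frame are all hypothetical, and the sketch assumes that the 250 residents were drawn from the pooled list of eligible residents in the six communities.

```python
import random


def multistage_sample(frame, n_streets=3, n_residents=250, seed=2014):
    """Draw a multi-stage stratified sample.

    frame maps stratum -> district -> street/town -> community -> list of
    eligible resident IDs (a hypothetical layout, for illustration only).
    """
    rng = random.Random(seed)
    communities = []
    pool = []
    for stratum in sorted(frame):  # strata, e.g. "suburban" and "urban"
        # Stage 1: one district per stratum.
        district = rng.choice(sorted(frame[stratum]))
        # Stage 2: three streets/towns within the chosen district.
        streets = rng.sample(sorted(frame[stratum][district]), n_streets)
        for street in streets:
            # Stage 3: one community within each chosen street/town.
            community = rng.choice(sorted(frame[stratum][district][street]))
            communities.append((stratum, district, street, community))
            pool.extend(frame[stratum][district][street][community])
    # Final stage: residents drawn from the pooled communities (assumption).
    residents = rng.sample(pool, n_residents)
    return communities, residents


# Toy frame: 2 strata x 3 districts x 5 streets x 4 communities x 60 residents.
frame = {
    stratum: {
        f"{stratum}-d{d}": {
            f"{stratum}-d{d}-s{s}": {
                f"{stratum}-d{d}-s{s}-c{c}": [
                    f"{stratum}-d{d}-s{s}-c{c}-r{r}" for r in range(60)
                ]
                for c in range(4)
            }
            for s in range(5)
        }
        for d in range(3)
    }
    for stratum in ("urban", "suburban")
}
communities, residents = multistage_sample(frame)
```

With this toy frame the function returns six communities (three per stratum) and 250 distinct resident identifiers.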
Ethics approval for this study was obtained from the academic and ethical committee of Nanjing Municipal Center for Disease Control and Prevention (Nanjing CDC). All the participants provided their written informed consent before the survey.
Study design
Each participant completed the same FFQ twice (FFQ1: the first FFQ and FFQ2: the second FFQ), 1 year apart. Four sets of three consecutive 24-HDR were collected at 3-month intervals during the 1-year period. The first set of three consecutive 24-HDR was obtained 1 month after the administration of FFQ1 (in June 2014), and the last set was obtained 1 month before the administration of FFQ2 (in May 2015). Participants who did not satisfactorily complete the two FFQ or all three consecutive 24-HDR (n 15) and those who had extreme values for total energy intake (<2092 kJ/d (500 kcal/d) or >20 920 kJ/d (5000 kcal/d), n 5)( Reference Watson, Collins and Sibbritt 11 ) were excluded from this study. Thus, a total of 203 subjects (81·9 % of the 248 eligible residents) were included in the final analysis.
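The energy cut-offs and the participant flow can be checked with simple arithmetic. The sketch below assumes the usual conversion 1 kcal = 4·184 kJ; the function name `is_plausible_energy` is hypothetical.

```python
KJ_PER_KCAL = 4.184  # standard thermochemical conversion factor


def is_plausible_energy(kcal_per_day, low=500.0, high=5000.0):
    """True if total energy intake is inside the stated exclusion cut-offs."""
    return low <= kcal_per_day <= high


# Cut-offs expressed in kJ/d, as quoted in the text.
low_kj = round(500 * KJ_PER_KCAL)    # 2092
high_kj = round(5000 * KJ_PER_KCAL)  # 20920

# Participant flow taken from the text.
eligible, responded = 248, 223
incomplete, extreme_energy = 15, 5
analysed = responded - incomplete - extreme_energy

response_rate = 100 * responded / eligible  # about 89.9 %
analysed_rate = 100 * analysed / eligible   # about 81.9 %
```

Both percentages use the 248 eligible residents as the denominator, which reproduces the 89·9 % and 81·9 % reported in the text.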
FFQ and 24-h dietary recalls
The FFQ included eighty-seven food items and ten food categories (cereal; red meat (pork, beef, mutton); poultry; fish and shrimp; eggs; dairy products; soya-based foods; vegetables; fruits; snacks/desserts), which covered about 90 % of the commonly consumed foods in Nanjing. For each food item, participants were asked to recall the frequency of consumption (daily, weekly, monthly, annually or never) and the amount consumed, using a common unit of weight in China (1 liang=50 g), over the past 12 months. Individual consumption of each food item was converted to grams per day for further analysis. According to the similarity of nutrient profiles and culinary usage among the foods and the grouping scheme used in other studies, we aggregated the eighty-seven food items into thirty pre-defined food groups( Reference Shim, Oh and Kim 12 , Reference Qin, Melse-Boonstra and Yuan 13 ). For seasonal vegetables and fruits, participants were asked to recall how often they ate these foods during the season. The Chinese Food Composition Table( Reference Yang 14 ) was used to estimate the daily energy intake (kJ/d (kcal/d)) and major nutrient intakes of study participants.
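The conversion from reported frequency and portion size to grams per day can be illustrated as follows. This is a minimal sketch, not the authors' procedure: the divisors (7 d per week, 30 d per month, 365 d per year) and the `season_fraction` weighting for seasonal foods are assumptions, and the function name is hypothetical. Only the unit 1 liang = 50 g comes from the text.

```python
GRAMS_PER_LIANG = 50.0  # 1 liang = 50 g, as stated in the text

# Assumed number of days covered by each frequency category.
DAYS_PER_PERIOD = {"daily": 1.0, "weekly": 7.0, "monthly": 30.0, "annually": 365.0}


def grams_per_day(times, period, liang_per_time, season_fraction=1.0):
    """Convert an FFQ response to an average daily intake in grams.

    times           -- number of eating occasions per period
    period          -- "daily", "weekly", "monthly", "annually" or "never"
    liang_per_time  -- amount eaten per occasion, in liang
    season_fraction -- share of the year the food is in season (assumption)
    """
    if period == "never":
        return 0.0
    per_day = times / DAYS_PER_PERIOD[period]
    return per_day * liang_per_time * GRAMS_PER_LIANG * season_fraction


# Example: 2 liang of a food eaten three times a week.
example = grams_per_day(3, "weekly", 2)  # 3/7 * 2 * 50 g, roughly 42.9 g/d
```

Daily nutrient intakes would then follow by multiplying each item's grams per day by its nutrient content per gram from the food composition table and summing over items.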
The three consecutive 24-HDR were administered for 2 weekdays and 1 weekend day in a usual week. Each participant was asked to provide the name and amount of all foods consumed during the previous 24 h. If the previous day was a special day owing to feasts or travel, food consumption for the day before that 24-h period was recorded, or another day was chosen and the participant was interviewed by telephone. The amounts of the different food items mixed in one dish were recorded separately. The recalled food items were assigned to the corresponding food groups as defined by the FFQ. We calculated daily mean intakes of energy (kJ/d (kcal/d)), thirteen nutrients and ten food groups from the four sets of three consecutive 24-HDR. The mean 24-HDR (m24-HDR) data were used as the standard against which to measure the relative validity of the FFQ.
Trained interviewers from the local CDC administered the two FFQ and the four sets of three consecutive 24-HDR in face-to-face interviews. All dietary information was checked after collection. Any implausible or ambiguous information was verified with the participant. Each participant had the same interviewer throughout the study period.
Statistical analysis
Medians and inter-quartile ranges for energy, nutrients and food intake were calculated for both FFQ and 24-HDR. The reproducibility was estimated using the Wilcoxon’s signed-rank test, Spearman’s correlation, intra-class correlation coefficient (ICC), weighted κ statistic and misclassification (quartiles method) analyses to compare the intakes from FFQ1 and FFQ2( Reference Huybrechts, De Backer and De Bacquer 15 ). The validity of the FFQ relative to the 24-HDR was assessed by Wilcoxon’s signed-rank test, Spearman’s correlation, weighted κ statistic and misclassification (quartiles method) analyses. The residual method was used to exclude the possibility of variation due to energy intake( Reference Willett, Howe and Kushi 16 ). To correct for within-person error in the measurement of the HDR, the observed correlation was multiplied by the de-attenuation factor $$(1+\gamma /n)^{1/2}$$ , where γ is the ratio of the within- and between-person variances and n is the number of 24-HDR (here n 12).
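The residual method and the de-attenuation step can be sketched as below. This is an illustrative reading of the two techniques rather than the authors' SPSS procedure: the function names are hypothetical, the regression is ordinary least squares on untransformed intakes (log-transformed intakes are often used instead), and the variance components passed to `deattenuate` are assumed to have been estimated beforehand from the repeated 24-HDR.

```python
from statistics import mean


def energy_adjusted(nutrient, energy):
    """Residual method: regress nutrient intake on total energy intake and
    return the residuals plus the predicted intake at mean energy."""
    mx, my = mean(energy), mean(nutrient)
    sxx = sum((x - mx) ** 2 for x in energy)
    slope = sum((x - mx) * (y - my) for x, y in zip(energy, nutrient)) / sxx
    intercept = my - slope * mx
    expected_at_mean = intercept + slope * mx  # constant added back to residuals
    return [
        y - (intercept + slope * x) + expected_at_mean
        for x, y in zip(energy, nutrient)
    ]


def deattenuate(r_observed, within_var, between_var, n_recalls=12):
    """Multiply the observed correlation by (1 + gamma / n) ** 0.5, where
    gamma is the ratio of within- to between-person variance."""
    gamma = within_var / between_var
    return r_observed * (1 + gamma / n_recalls) ** 0.5


# Toy data: the nutrient is exactly proportional to energy, so every
# energy-adjusted value collapses to the mean nutrient intake (20).
energy = [1000.0, 2000.0, 3000.0]
nutrient = [10.0, 20.0, 30.0]
adjusted = energy_adjusted(nutrient, energy)

# Hypothetical inputs: r = 0.50 and a variance ratio of 3 with 12 recalls.
corrected = deattenuate(0.50, within_var=3.0, between_var=1.0, n_recalls=12)
```

Because the energy-adjusted values are regression residuals shifted by a constant, they are uncorrelated with total energy intake by construction. With 12 recall days the de-attenuation factor stays close to 1 unless the variance ratio is large; in the toy call it is (1 + 3/12)^0·5, about 1·12.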
Bland–Altman plots were used to examine the agreement between the two dietary assessment methods across the range of intakes. The difference between the two methods was plotted against the average of the two methods. The mean difference and the 95 % limits of agreement (LOA), calculated as mean difference ±1·96 (sd of differences), were used to summarise agreement at the population level. Natural-log (ln) transformations were performed in order to narrow the LOA, as recommended by Bland & Altman( Reference Bland and Altman 17 ).
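The Bland–Altman summary on the ln scale can be sketched as follows. This is a minimal illustration, not the MedCalc implementation: the function name and the toy intakes are hypothetical, and the sketch assumes strictly positive intakes so that the logarithm is defined. Back-transforming (anti-logging) the mean difference and the LOA gives the ratio of the test method to the reference method, expressed here as a percentage.

```python
import math
from statistics import mean, stdev


def bland_altman_ln(test, reference):
    """Mean difference and 95 % limits of agreement on the ln scale,
    back-transformed to percentages of the reference method."""
    diffs = [math.log(t) - math.log(r) for t, r in zip(test, reference)]
    avgs = [(math.log(t) + math.log(r)) / 2 for t, r in zip(test, reference)]
    md = mean(diffs)
    sd = stdev(diffs)  # sample sd of the differences
    lower, upper = md - 1.96 * sd, md + 1.96 * sd
    return {
        "mean_ratio_pct": 100 * math.exp(md),
        "loa_pct": (100 * math.exp(lower), 100 * math.exp(upper)),
        "points": list(zip(avgs, diffs)),  # (average, difference) pairs to plot
    }


# Toy energy intakes (kcal/d) from a hypothetical FFQ and reference method.
ffq = [1800.0, 2100.0, 1950.0, 2400.0, 1700.0]
recalls = [1900.0, 2000.0, 2100.0, 2300.0, 1850.0]
summary = bland_altman_ln(ffq, recalls)
```

A back-transformed mean ratio below 100 % means that the test method gives lower values than the reference on average, and a ratio near 100 % with wide LOA indicates little average bias but poor agreement for individuals.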
All the statistical analyses were performed using SPSS (version 20.0) and MedCalc (version 11.4). A P value<0·05 was considered to be statistically significant.
Results
Of the 203 participants eligible for the present study, 48·8 % were males. The mean age was 50·4 (sd 12·2) years (range 31–80 years); the mean BMI was 23·1 (sd 2·8) kg/m2; and 79·5 % had education of junior high school or below. The proportions of current smokers and drinkers were 22·0 and 28·8 %, respectively (Table 1).
* Data were from FFQ1.
Table 2 presents the median intakes of total energy, nutrients and food groups derived from the FFQ and the m24-HDR, the percentage differences and the results of the Wilcoxon’s signed-rank test. The Wilcoxon’s signed-rank test showed that the intakes of almost all nutrients and food groups assessed by the two FFQ were not significantly different, except for fat (P=0·035), retinol (P=0·010) and vitamin B1 (P=0·020). The median intakes for all nutrients and food groups assessed using FFQ2 were higher than or equal to the values obtained using FFQ1, with differences in median intakes between 0 and −25·4 %. There was also no significant difference between intakes of nutrients and food groups assessed by FFQ2 and m24-HDR (both assessments cover the same time period), with the exception of fibre (P<0·001), retinol (P=0·014), carotene (P=0·005), eggs (P=0·022) and soya-based foods (P<0·001). Compared with the m24-HDR, the FFQ tended to underestimate intakes of most nutrients and food groups.
The crude and energy-adjusted correlation coefficients for FFQ1 and FFQ2 are presented in Table 3. For total energy, nutrients and food groups, crude Spearman’s correlation coefficients ranged from 0·66 for retinol and eggs to 0·88 for vitamin C, and the crude ICC ranged from 0·65 for fat to 0·87 for vitamin C. After adjusting for energy, most correlation coefficients decreased. The energy-adjusted Spearman’s correlation coefficients ranged from 0·41 (poultry) to 0·83 (carbohydrate), whereas the energy-adjusted ICC ranged from 0·45 (fruits) to 0·80 (soya-based foods). The crude, energy-adjusted and de-attenuated Spearman’s correlation coefficients between the FFQ and the m24-HDR are presented in Table 3. These values enable the assessment of the relative validity of the FFQ. The crude Spearman’s correlation coefficients for FFQ2 and the m24-HDR ranged from 0·21 (soya-based foods) to 0·69 (fat, retinol). The energy-adjusted coefficients ranged from 0·19 (soya-based foods) to 0·58 (fat, vitamin C), whereas the de-attenuated coefficients ranged from 0·25 (soya-based foods) to 0·71 (fat).
When the intakes were categorised into quartiles, the ranges of agreement rates for the same or adjacent quartile classifications were 90·8–100 %, when derived from the two FFQ, and 71·8–89·3 %, when derived from FFQ2 and the m24-HDR. Extreme misclassification into opposite quartiles was <10·0 % for all nutrients and food groups, with the exception of soya-based foods (10·2 %). The weighted κ statistic showed moderate conformity, ranging from 0·57 to 0·80 for the two FFQ and from 0·16 to 0·53 for FFQ2 and the m24-HDR (Table 4). No difference was observed in the reproducibility and validity between men and women.
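The quartile cross-classification and the weighted κ statistic can be sketched as follows. The function names are hypothetical, ties are broken by input order, and linear weights are an assumption, since the weighting scheme is not stated in the text.

```python
def quartile_ranks(values):
    """Assign each value to a quartile (0-3) by rank; ties follow input order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    quartiles = [0] * len(values)
    for rank, idx in enumerate(order):
        quartiles[idx] = min(3, rank * 4 // len(values))
    return quartiles


def cross_classification(q1, q2):
    """Percentage in the same or adjacent quartile, and in opposite quartiles."""
    n = len(q1)
    same_or_adjacent = sum(abs(a - b) <= 1 for a, b in zip(q1, q2))
    opposite = sum(abs(a - b) == 3 for a, b in zip(q1, q2))
    return 100 * same_or_adjacent / n, 100 * opposite / n


def weighted_kappa(q1, q2, k=4):
    """Cohen's kappa with linear weights w = 1 - |i - j| / (k - 1) (assumed)."""
    n = len(q1)
    observed = sum(1 - abs(a - b) / (k - 1) for a, b in zip(q1, q2)) / n
    p1 = [q1.count(i) / n for i in range(k)]
    p2 = [q2.count(j) / n for j in range(k)]
    expected = sum(
        p1[i] * p2[j] * (1 - abs(i - j) / (k - 1))
        for i in range(k)
        for j in range(k)
    )
    return (observed - expected) / (1 - expected)


# Toy example: eight subjects ranked in exactly reversed order by two methods.
method_a = quartile_ranks([1, 2, 3, 4, 5, 6, 7, 8])
method_b = quartile_ranks([8, 7, 6, 5, 4, 3, 2, 1])
agreement_pct, opposite_pct = cross_classification(method_a, method_b)
kappa = weighted_kappa(method_a, method_b)
```

In this deliberately extreme example half of the subjects fall into opposite quartiles and the weighted κ is negative, whereas two identical rankings give 100 % agreement and κ = 1.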
The Bland–Altman plots for total energy intake, protein, fat and carbohydrates are presented in Fig. 2–5. Anti-logging rendered mean difference and LOA of 96·1 % (95 % CI 44·5, 209·5), 99·0 % (95 % CI 32·0, 309·6), 94·2 % (95 % CI 39·5, 224·0) and 97·7 % (95 % CI 42·7, 222·5), respectively, for energy intake, protein, fat and carbohydrates. For almost all nutrients, <10 % of the subjects were outside the LOA.
Discussion
In this study, we examined the reproducibility and validity of an FFQ used for the Nanjing cross-sectional nutrition and health survey. According to a previous review( Reference Cade, Thompson and Burley 10 ), the number of food items listed in FFQ should range from 5 to 350. The FFQ used in this study was composed of eighty-seven food items, which covered about 90 % of the commonly consumed foods in Nanjing. We evaluated the performance of the FFQ by comparing intake of nutrients and selected food groups obtained from this instrument with those derived from the 24-HDR.
In the present study, the median intakes for almost all nutrients and food groups obtained from FFQ2 were higher than or equal to the values obtained from FFQ1. This might be due to a learning effect: participants may have become more mindful of what they ate, and thus estimated the amounts more accurately, after the previous surveys. Spearman’s correlation coefficients for reproducibility in our study ranged from 0·66 to 0·86 for food groups and from 0·66 to 0·88 for nutrients. The ICC between FFQ1 and FFQ2 were 0·66–0·86 for food groups and 0·65–0·88 for nutrients. Compared with other studies( Reference Villegas, Yang and Liu 7 – Reference Xia, Sun and Zhang 9 , Reference Ibiebele, Parekh and Mallitt 18 – Reference Pietinen, Hartman and Haapa 23 ), in which the correlation coefficients generally ranged from 0·20 to 0·80, the correlation coefficients in the present study were slightly higher. This may reflect the fact that the dietary habits of individuals in Nanjing are relatively stable. Energy adjustment did not improve the correlations for nutrients and food items. Energy adjustment is expected to increase correlation coefficients only when the variability of nutrient consumption is related to energy intake.
Various time intervals between FFQ1 and FFQ2, from 15 d to several years, have been reported in previous studies( Reference Shu, Yang and Jin 6 , Reference Vereecken and Maes 24 ). Reproducibility tests are based on the assumption that diet does not change between two questionnaires; thus, reproducibility may ideally be obtained by two administered questionnaire surveys with a short interval( Reference Hakim, Hartz and Harris 25 – Reference Gulliford, Mahabir and Rocke 28 ). However, subjects are more likely to remember and repeat their responses in that case. In this study, we administered FFQ1 and FFQ2 with an almost 1-year interval, which can reduce the above-mentioned error. As categorised dietary intake rather than the absolute amount of intake has been more commonly used in epidemiological studies of diet and chronic diseases, we performed misclassification analyses, showing that the percentages of participants correctly classified into the same or adjacent categories and the weighted κ values in our study were higher than those reported by other validation studies( Reference Zhuang, Yuan and Lin 8 , Reference Xia, Sun and Zhang 9 ). In the Suihua female adolescent study, the agreement rates for classifying nutrient and food group intakes into the same or adjacent categories were 70·8–92·9 %, and the weighted κ values were 0·35–0·60( Reference Xia, Sun and Zhang 9 ). In another study, the rates ranged from 73·0 to 86·0 % for the agreement and from 0·20 to 0·50 for the weighted κ values( Reference Zhuang, Yuan and Lin 8 ).
The results of estimated relative validity depend on several factors such as choice of reference method, degree of homogeneity of the intake values within the population, recall period and the number of days of record collection( Reference Block, Woods and Potosky 29 ). Selecting the appropriate reference method by which to assess the test measurement is a major part of the validation process. In a review on the validation of FFQ, the authors showed that 75 % of studies validated the FFQ against another dietary method. When co-operation or literacy of study subjects is limited, 24-h recalls may be more appropriate( Reference Cade, Thompson and Burley 10 ). Moreover, it is important that the measurement errors of the FFQ and the reference method are independent. In the present study, relative validity was tested by comparing FFQ2 with the average of the four sets of three consecutive 24-HDR, one set for each season. The multiple recalls were able to minimise the effect of daily and seasonal variation in dietary intake on the dietary assessment. Compared with the 24-HDR, the FFQ tended to underestimate the intake of most nutrients. The mean differences shown in the Bland–Altman plots were all negative. We also found that the FFQ underestimated the intakes of red meat, poultry, eggs and dairy products. These results are very similar to those of the Shanghai men’s study( Reference Villegas, Yang and Liu 7 ). Some of the measurement error may reflect a bias of study participants seeking social approval( Reference Hebert, Clemow and Pbert 30 ).
The correlation coefficients of our study were consistent with the results reported in other Chinese population studies, which ranged from 0·15 to 0·72( Reference Shu, Yang and Jin 6 – Reference Xia, Sun and Zhang 9 ). The validity correlation decreased for most of the nutrients and food groups after adjustments for energy. This might be due to the between-person variation in the intakes of nutrients and food groups in subjects. There was very little change in the correlations of nutrient and food intakes assessed by the FFQ and averaged 24-HDR after we further adjusted for the within-person variation from the multiple 24-HDR. If the frequency of consumption is low and the within-person variability is too high, the correlation coefficients can be attenuated( Reference Salvini, Hunter and Sampson 31 ). Our results indicate that the dietary intake of individuals in Nanjing did not vary significantly during the four seasons; this may be because Nanjing is situated in the Yangtze River Delta, with easy access and a well-developed infrastructure for the movement of goods, so that the same foods can be purchased from the market throughout the year. The misclassification analyses showed that more than 70 % of the subjects were classified into the same or adjacent quartile for food group and nutrient intakes by both methods, which compares well with other studies( Reference Shu, Yang and Jin 6 – Reference Xia, Sun and Zhang 9 , Reference Haftenberger, Heuer and Heidemann 32 , Reference Deschamps, Lauzon-Guillain and de, Lafay 33 ). The weighted κ value for most nutrients and food groups reached the ‘acceptable’ threshold.
Natural-log (ln) transformations were applied in the Bland–Altman analysis in order to narrow the LOA. Wide LOA indicate the potential for large differences between the methods, in which case agreement is considered poor. The LOA in our study were wider than those reported in two previous studies( Reference Villegas, Yang and Liu 7 , Reference Zhuang, Yuan and Lin 8 ); however, the mean differences in nutrient intakes were approximately 0. This may indicate that the FFQ is more suitable for ranking intakes than for estimating absolute intakes, and that misclassification of nutrient intake is less likely to cause systematic bias.
There are several limitations to our study. First, owing to the lack of a perfect standard for measuring dietary intake against which to assess the validity of a dietary instrument, we chose dietary recalls as the reference method. This method has the advantage of capturing actual intake on specific days, but it also has weaknesses, such as under-reporting. We attempted to minimise under-reporting by checking the dietary recalls and following up incomplete or ambiguous information directly with respondents. Second, the analysis of reproducibility and validity was confined to adults aged 31–80 years. It is unclear whether our findings can be generalised to children, adolescents and younger adults. Finally, the data would be more representative if three consecutive 24-HDR were collected monthly, instead of each season.
In summary, our study evaluated the reproducibility and validity of an eighty-seven-item FFQ developed specifically for investigating the relationship between dietary factors and chronic diseases in the Nanjing community-based cross-sectional study. The results of this study indicated that the FFQ can reasonably categorise usual intake of major nutrients and food groups among the study population, although it may not quantify the absolute intake of some nutrients or foods.
Acknowledgements
The authors are grateful to all the dedicated fieldworkers who took part in the survey, and all the participants who facilitated the survey implementation at each community.
The present study was supported by Nanjing Municipal Medical Science and Technique Development Foundation, China (2012-YKK12166).
Q. Y. and F. X. contributed to the study design, data analysis and manuscript writing; Q. Y., X. H., Z. W., H. Y., X. C., H. Z., C. W., Y. L. and L. S. were responsible for data collection; X. H., Z. W., H. Y., X. C., H. Z., C. W., Y. L. and L. S. were responsible for manuscript revision.
The authors report no conflicts of interest.