The use of vitamin and mineral supplements (ViMiS; referred to as ‘supplements’) in the UK has risen to 40 % for women and 29 % for men according to national surveys held in 2000–2001(1). These results show that supplements can increase the mean intake of vitamins by 4–183 % and of minerals by 0–16 %, depending on the type of vitamin and sex and age of the participant.
Previous studies indicate that supplement consumption should be taken into account when assessing dietary nutrient intake; otherwise misclassification of nutrient intakes of individuals and unclear relationships with biomarkers can occur(2). Although several UK-based cohort studies have compared supplement users (SU) with non-supplement users (NSU) with regard to sociodemographic factors, morbidity and nutrient and food intakes, they have not been able to estimate total nutrient exposure(3–5), i.e. the combined nutrient exposure from foods and supplements.
In order to analyse total nutrient exposure for the European Prospective Investigation into Cancer and Nutrition in Norfolk (EPIC-Norfolk), an equivalent to the UK National Food Composition Database (UK FCD) for supplement composition was needed. As laboratory analytical data were not available for supplements on the UK market in the early 1990s, nor to date, it was necessary to design a database structure and calculation method to calculate the nutrients from supplements by using the information given on manufacturers’ labels. The information collected makes it possible to identify the heterogeneity among SU; for instance, nutrient dosages differ widely between manufacturers of supplements. Likewise, a single vitamin C supplement can easily provide 500–1000 mg of vitamin C per tablet, whereas a multivitamin provides around 60 mg/tablet.
However, label-based information differs from the UK FCD in several ways and hence the ViMiS database was developed to overcome these issues. The present study explains the rationale used in the development of the ViMiS database, as well as in the data-entry system and nutrient calculations, including the system of generics that was developed to deal with missing data. We also provide results for supplement use in the cohort.
Methods
The ViMiS database was set up in 1994 as part of the EPIC-Norfolk study, a prospective study investigating the relationship between diet and chronic disease that started in 1993(6). Participants aged between 40 and 79 years, living in the Norfolk area of East Anglia, were recruited from age–sex registers of general practitioners. Of the 30 452 participants providing informed consent, 25 630 took part in a health check-up. During the health check-up, anthropometric measurements were taken and blood and urine samples were collected. A 7 d diet diary (7dDD), of which the first day was recorded as a 24 h recall, was completed by 23 656 participants (92 % response), and these data, including fortified foods, were entered using the Data Into Nutrients for Epidemiological Research (DINER) program(7). The supplements were entered in a separate data-entry program and defined according to the directive of the Council of the European Parliament, in which a food supplement means ‘foodstuffs the purpose of which is to supplement the normal diet and which are concentrated sources of nutrients or other substances with a nutritional or physiological effect, alone or in combination, marketed in dose form…’(8). To allow total nutrient intake to be estimated, the database also included prescribed and over-the-counter (OTC) medication that contained vitamins and/or minerals such as ferrous sulfate and calcium carbonate.
SU were defined as anyone who reported taking a supplement or drug, as described above, at any time on at least one occasion, during the period in which they recorded their 7 d food intake.
Development of the vitamin and mineral supplement database
A relational database for supplements was developed, which connected all the elements related to the supplement (e.g. manufacturers’ data, ingredients, nutrients) with the participants who used them (e.g. amount and frequency consumed). For this purpose, the following applications were used: Oracle© (data entry and storage; Oracle Corporation, Redwood Shores, CA, USA) and the SAS statistical software package version 9·0 (data calculation; SAS Institute, Cary, NC, USA).
This project consisted of, first, gathering and interpreting information on supplements that were available on the UK market between 1993 and 1998, covering detailed information on brands and dosages; second, development of conversion factors, since label-based information differs from the UK FCD in several ways: (i) units of measurement used, (ii) supplements contain complex ingredients such as oils and plant ingredients and (iii) supplements contain ingredients that provide one or more individual nutrients, e.g. calcium phosphate providing calcium and phosphorus; third, development of a data-entry system, suitable for entering participants’ data that were reported at different levels of accuracy; fourth, development of a calculation method for the nutrients derived from supplements; and finally, development of a system to manage missing data in the participants’ diaries (see Fig. 1).
Vitamin and mineral supplement database: collection of manufacturers' data
The data for the ViMiS database were collected from 1992 onwards by using packaging material received from participants and major manufacturers in the UK. The manufacturers were asked by letter to give the ingredient list and nutrient declaration as described on the packaging of their supplements. Information sheets or packaging material of the most frequently consumed brands was received. No biochemical analysis of supplements was undertaken. The data received were checked and corrected and then entered per unit of consumption, e.g. per capsule/tablet or per 5 ml for liquids.
Within the database, supplements were grouped according to the main nutrient(s) they provided (Appendix 1), in order to structure the large amount of data and improve efficiency for subsequent analysis. Every supplement was given a supplement ID created by using the following five characteristics (Fig. 2).
1. Name of the supplement: an inventory of all the supplement names was made after which they were alphabetically sorted and sequentially numbered to create a directory.
2. Brand/manufacturer: a similar procedure to 1.
3. Strength value (dose of one capsule, tablet, etc.; e.g. 500).
4. Unit of measurement (of strength, e.g. mg).
5. Type (capsule, tablet, spoon, etc.).
By concatenating these five characteristic codes, preceded by the supplement group, a unique supplement ID was created. Any new supplement or manufacturer’s name could easily be added to the directories (see Fig. 2 for extracts of directories).
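The construction of a supplement ID can be illustrated with a minimal Python sketch. This is not the Oracle implementation used in the study: the field widths, the padding character, the type codes and the names `SupplementParts` and `make_supplement_id` are assumptions made for the example, and the directory numbers are invented. `ZN_` is one of the supplement group codes used in the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SupplementParts:
    group: str     # supplement group code (three characters, e.g. "ZN_")
    name: int      # sequential number from the supplement-name directory
    brand: int     # sequential number from the brand/manufacturer directory
    strength: int  # dose of one capsule, tablet, etc. (e.g. 500)
    unit: str      # unit of measurement of the strength (e.g. "mg")
    form: str      # type code (hypothetical values: "CAP", "TAB", "SPN")


def make_supplement_id(p: SupplementParts) -> str:
    """Concatenate the group code and the five characteristic codes.

    Fixed field widths (an assumption) give every ID the same length,
    so the individual parts can be recovered by position.
    """
    return (
        f"{p.group}{p.name:04d}{p.brand:03d}"
        f"{p.strength:05d}{p.unit:_<3}{p.form}"
    )


# Invented example: name no. 12, brand no. 7, 15 mg tablet.
example = SupplementParts("ZN_", name=12, brand=7, strength=15, unit="mg", form="TAB")
example_id = make_supplement_id(example)
print(example_id)
```

Because the ID is built only from directory codes, adding a new name or brand to a directory extends the range of possible ID without changing their structure.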
Vitamin and mineral supplement database: developing a database of compounds with their conversion factors
All ‘compound ingredients’ on the labels of supplements were coded, ranging from vitamins, minerals, oils and fatty acids to excipients, such as flavourings, antioxidants and tabletting agents. The compounds were coded according to the compound (source) and ‘nutrient end products’, e.g. calcium ascorbate is the source and provides vitamin C (end product 1) and calcium (end product 2). These end products were coded separately. Supplements often contain multiple compounds, where one compound can provide multiple nutrients, and individuals can consume the same nutrient from multiple compounds, or even multiple supplements. By summing the end products that deliver the same nutrient, i.e. the ‘nutrient/ingredient’ group, the nutrient exposure derived from supplements was calculated (Fig. 3).
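As a minimal illustration of this summation, the following Python sketch adds up end products that belong to the same nutrient/ingredient group. The function name `sum_by_nutrient_group` and all amounts are hypothetical; only the principle (one compound may supply several end products, and one nutrient may come from several compounds) is taken from the text.

```python
from collections import defaultdict


def sum_by_nutrient_group(end_products):
    """Sum end-product amounts (mg) that deliver the same nutrient."""
    totals = defaultdict(float)
    for nutrient_group, amount_mg in end_products:
        totals[nutrient_group] += amount_mg
    return dict(totals)


# Hypothetical end products of one supplement, already converted to mg.
end_products = [
    ("vitamin C", 50.0),   # end product 1 of calcium ascorbate
    ("calcium", 5.0),      # end product 2 of calcium ascorbate
    ("vitamin C", 10.0),   # from ascorbic acid
    ("calcium", 100.0),    # from calcium carbonate
]

nutrient_totals = sum_by_nutrient_group(end_products)
print(nutrient_totals)
```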
The majority of the manufacturers’ data was stated on the label at a nutrient level. Where the nutrient declaration contained the weight of a compound rather than the weight of the nutrient, a conversion factor was applied to derive the nutrient end product from the compound. This conversion factor was called the ‘compound factor’ (see Table 1 for examples). To determine the compound factor, several literature sources were used, and where existing factors were unavailable, factors were determined by calculating molecular weight and the proportion of the nutrient in the compound(9–11). All nutrient values were entered into the database using the value and the unit of measurement as on the packaging. If this unit did not match the unit in the ViMiS database, the value was multiplied by a ‘unit factor’ to obtain a value in the default unit of measurement. Quantities of liquid supplements, such as cod liver oil, were multiplied by a ‘density factor’ so that the grams of fatty acids could be calculated. Compounds that have different ‘biological activities’ due to different esters or differences in natural and synthetic forms, such as vitamin E, carotenoids and folic acid, needed to be converted into standardised units in order to ensure that supplement values for nutrients were comparable among the different supplements as well as with the UK FCD(12–14).
FCD, Food Composition Database; 7dDD, 7 d diet diaries; ViMiS, vitamin and mineral supplements; Fac_comp, compound factor – factor applied to the weight of the compound to derive the weight of the end product; Fac_dens, density factor – factor applied to convert the volumetric measurement to a weight (only used for oils); Fac_unit, unit factor – factor applied to convert from manufacturers’ unit to ViMiS database unit; Fac_bio, bioactivity factor – factor applied to standardise the biological activity of the vitamin.
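The factor system can be sketched in Python as follows. The compound factor is computed here as the mass fraction of the nutrient in the compound, using standard atomic weights. Combining the four factors as a simple product is an assumption made for this sketch, as are the function names and the label value of 1250 mg; the factors actually used are those in Table 1.

```python
# Standard atomic weights (g/mol), rounded to three decimals.
ATOMIC_WEIGHT = {"Ca": 40.078, "C": 12.011, "O": 15.999}


def compound_factor(formula, element):
    """Mass fraction of `element` in a compound given as {element: atom count}."""
    molar_mass = sum(ATOMIC_WEIGHT[el] * n for el, n in formula.items())
    return ATOMIC_WEIGHT[element] * formula[element] / molar_mass


def nutrient_amount(label_value, fac_comp=1.0, fac_dens=1.0, fac_unit=1.0, fac_bio=1.0):
    """Convert a label value to the database unit.

    Assumption: the factors are applied as a product; a factor of 1.0
    means that the conversion is not needed for this ingredient.
    """
    return label_value * fac_comp * fac_dens * fac_unit * fac_bio


# Calcium carbonate (CaCO3) declared as compound weight on a hypothetical label.
calcium_carbonate = {"Ca": 1, "C": 1, "O": 3}
fac_ca = compound_factor(calcium_carbonate, "Ca")    # roughly 0.40
calcium_mg = nutrient_amount(1250, fac_comp=fac_ca)  # roughly 500 mg calcium

# A label value in micrograms stored in a database that uses milligrams.
folic_acid_mg = nutrient_amount(400, fac_unit=0.001)
```

In this example about 40 % of the weight of calcium carbonate is calcium, so a declared 1250 mg of the compound corresponds to roughly 500 mg of the nutrient.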
All ingredients on the labels were entered in the ViMiS database, including ingredients for which no values were known. This was particularly needed for bioactive compounds, such as plant extracts or oils, and for supportive ingredients such as vitamin E used as an antioxidant or calcium phosphate as an anti-caking agent (known as ‘indicator compounds’). For these indicator compounds, the data were entered as a code and during the calculation process these compounds were dealt with as binary variables, i.e. absent or present.
Participant supplement data collection
Participants in the EPIC-Norfolk study were asked to record their supplement use in a 7dDD by answering the open-ended questions shown in Fig. 4. Supplement use was recorded for the same days that the 7dDD was maintained. In later versions of the diaries, an example was provided and more space was given for participants to note the supplement(s) taken, with clearer options to record multiple supplements.
Vitamin and mineral supplement data entry of participants’ supplements
The participants’ data were entered using data-entry screens made with Oracle© forms. These forms showed the same five characteristics as used for the characterisation of supplements in the ViMiS database (Fig. 2). Each supplement was recorded in a free text box, after which the data enterer found a match for the supplement name, brand name, unit and type in the alphabetically ordered directories by typing in their respective codes. Any supplement or brand name for which no match could be found in the directory was marked with a separate code. Since data entry occasionally resulted in data-entry errors, the free text box was used by a nutritionist to check the entered codes without the need to go back to the original diary. The number of capsules, tablets, etc. was entered separately for every day of the week. There was space to enter a maximum of twelve supplements for each participant. Data enterers received training beforehand and their data entry was monitored until a satisfactory level of precision was reached.
New supplement and brand names were verified by a nutritionist using existing manufacturers’ information or the Internet and added to the directory used for data entry. For new supplement names, consistency across brands was checked. For instance, ‘Cod liver oil with added vitamins A, D and E’ could mean the same supplement as ‘Cod liver oil’ for a particular brand; however, for other brands these names referred to different supplements with different compositions, and hence they were allocated different supplement name codes. ‘Abbreviated’ supplement names were used frequently by participants, necessitating many additions to the coding directory.
Participants often forgot to mention a strength value and/or unit. This was not interpreted at the stage of data entry, but was coded up to the detail available and, where necessary, as missing data. In case the diary raised serious doubt as to whether a supplement was consumed or what this supplement was, a second source of information was used; these were the data collected during the health check-up. This would have been the day before the start of the diary and would have reflected the usual drug/supplement intake. In this way missing values were minimised wherever possible.
Vitamin and mineral supplement database: developing a database of generic supplements
During entry it became apparent that many participants had provided incomplete information about the supplement(s) that they had been using. This meant that a logical system had to be designed to fill in this missing information. The ambiguous/missing part of the supplement ID was replaced by a weighted average of the alternatives available in the ViMiS database, called a ‘generic supplement’. This ID had the same structure as all the other supplement ID, consisting of the same five characteristics. A range of generic ID was created for each supplement group to cover the various known and unknown characteristics in order to assign the best possible match to the supplement consumed by the participant (Fig. 5).
Generic ID used less-specific names than the manufacturers’ names, and for each generic ID a range of manufacturers’ supplement names were identified that could be possible matches (Fig. 6). The frequencies of these supplements were counted in the entered participants’ data, after which the weighted average of the nutrients was calculated to create the nutrient profile of the generic supplement ID.
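The weighted average behind a generic supplement can be sketched in Python as below. The function name `generic_profile`, the candidate supplements and their reported frequencies are hypothetical; the sketch only shows how frequencies counted in the participants’ data could weight the nutrient profiles of the matching manufacturers’ supplements.

```python
def generic_profile(candidates):
    """Frequency-weighted mean nutrient profile of candidate supplements.

    `candidates` is a list of (times_reported, {nutrient: amount_per_unit}).
    A nutrient missing from a candidate's profile counts as zero.
    """
    total = sum(count for count, _ in candidates)
    if total == 0:
        raise ValueError("no reported supplements to average over")
    nutrients = sorted({n for _, profile in candidates for n in profile})
    return {
        n: sum(count * profile.get(n, 0.0) for count, profile in candidates) / total
        for n in nutrients
    }


# Hypothetical candidates for a generic "vitamin C, strength unknown" ID.
vitamin_c_candidates = [
    (30, {"vitamin C": 60.0}),    # reported 30 times
    (15, {"vitamin C": 500.0}),   # reported 15 times
    (5, {"vitamin C": 1000.0}),   # reported 5 times
]

generic_vitamin_c = generic_profile(vitamin_c_candidates)
print(generic_vitamin_c)
```

A consequence of this design is that a nutrient present in only a few of the candidates still appears, in a small amount, in the generic profile.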
Vitamin and mineral supplement nutrient calculations – quantitative analysis
It was important that the end result of the nutrient calculation of supplements be compatible with the nutrient calculations from the 7dDD. The first requirement of equal units of measurement was dealt with while populating the ViMiS database. The second requirement consisted of converting the (nutrient) intake per supplement into an individual average daily intake, taking dosage and frequency of use into account. These values were added to the nutrient intake from foods in order to retrieve the ‘total nutrient exposure’ for an individual.
The food consumption in the EPIC-Norfolk study was calculated by using DINER as the data-entry program and several in-house checking and calculation programs and the UK FCD(7). The data-entry program converted open-ended data into over 10 500 foods and appropriate portion sizes. Data were checked by nutritionists, and nutrient calculations resulted in nutrient intakes per individual per day, which were summed and divided by the number of days of recording.
During the nutrient calculations of the supplements, the entered supplement ID was compared with the database supplement ID (Fig. 7). If these supplement ID matched, the entered supplement ID would be assigned the same compounds and quantities as the database supplement ID. Because a large proportion of the participants did not fully describe the supplements they used, the supplement ID obtained by concatenating the entered supplement parts often differed from the supplement ID in the ViMiS database. There were two main routes by which a non-matching entered supplement ID was dealt with.
1. Assumption: a small change was made to the entered supplement to make it fit to a supplement in the database; for example, if the type code in a cod liver oil supplement was missing, then a capsule was assumed or if names were so specific to a certain brand but the brand was missing, then that particular brand was assumed.
2. Generic: the best-fitting generic ID was assigned (see section on generic supplements, Figs 5 and 6).
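The matching routes above can be sketched as a small Python function. This is an illustration under stated assumptions rather than the SAS code used in the study: supplement ID are represented as tuples instead of concatenated codes, only one kind of assumption (a default type for a supplement group) is shown, generic ID are looked up by group and name, and the group code "CLO", the brand and all other values are invented.

```python
def resolve(entered, database, default_form, generics):
    """Return (route, supplement ID) for an entered supplement.

    `entered` is (group, name, brand, strength, unit, form); None marks
    a characteristic that was missing from the diary.
    """
    if entered in database:
        return "direct", entered
    group, name, brand, strength, unit, form = entered
    # Route 1 (assumption): fill in a missing type with the group's default.
    if form is None and group in default_form:
        candidate = (group, name, brand, strength, unit, default_form[group])
        if candidate in database:
            return "assumption", candidate
    # Route 2 (generic): fall back to a generic ID for the group and name.
    generic_id = generics.get((group, name))
    if generic_id is not None:
        return "generic", generic_id
    return "unresolved", None


# Hypothetical database content and lookup tables.
full = ("CLO", "cod liver oil", "Brand A", 500, "mg", "capsule")
database = {full}
default_form = {"CLO": "capsule"}
generics = {("CLO", "cod liver oil"): "generic cod liver oil"}

no_form = ("CLO", "cod liver oil", "Brand A", 500, "mg", None)
no_brand = ("CLO", "cod liver oil", None, None, None, "capsule")

print(resolve(full, database, default_form, generics))
print(resolve(no_form, database, default_form, generics))
print(resolve(no_brand, database, default_form, generics))
```

The order of the checks reflects a preference for the most specific match: a direct match first, then a small assumption, and a generic ID only when neither is possible.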
In the next step, the amount and frequency of consumption were multiplied and divided by seven to give a daily average number of units, which was multiplied by the quantities of the compounds in the supplement; the compounds were then summed according to the ‘nutrient/ingredient group’ to which they belonged (Fig. 3). This resulted in nutrient intake on a ‘supplement’ level. The next step summed all the supplements consumed by one participant, to retrieve the ‘daily’ nutrient intake from supplements at the individual level. Finally, the nutrient intakes from foods and supplements were added together to give the ‘total’ exposure for a particular nutrient for an individual per day.
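The calculation steps described above can be sketched in Python. The function names, the supplement keys and all intake values are hypothetical; the sketch assumes that the number of units is available for each of the seven diary days and that nutrient profiles are stored per unit of consumption.

```python
def daily_from_supplements(records, profiles):
    """Average daily nutrient intake from all supplements of one participant.

    `records`: list of (supplement_id, [units taken on each of 7 days]).
    `profiles`: {supplement_id: {nutrient: amount per unit}}.
    """
    totals = {}
    for supplement_id, units_per_day in records:
        daily_units = sum(units_per_day) / 7  # weekly total averaged per day
        for nutrient, per_unit in profiles[supplement_id].items():
            totals[nutrient] = totals.get(nutrient, 0.0) + daily_units * per_unit
    return totals


def total_exposure(food, supplements):
    """Add food-sourced and supplement-sourced intake per nutrient."""
    return {
        n: food.get(n, 0.0) + supplements.get(n, 0.0)
        for n in sorted(set(food) | set(supplements))
    }


# Hypothetical participant: two vitamin C tablets and one multivitamin daily.
profiles = {
    "vitamin C 500 mg": {"vitamin C": 500.0},
    "multivitamin": {"vitamin C": 60.0, "iron": 14.0},
}
records = [
    ("vitamin C 500 mg", [2, 2, 2, 2, 2, 2, 2]),
    ("multivitamin", [1, 1, 1, 1, 1, 1, 1]),
]
food_intake = {"vitamin C": 80.0, "iron": 11.0}  # mg/d from the diet diary

supplement_intake = daily_from_supplements(records, profiles)
total_intake = total_exposure(food_intake, supplement_intake)
print(total_intake)
```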
Vitamin and mineral supplement nutrient calculations – qualitative analysis
Not all supplements could be analysed quantitatively (e.g. garlic supplements, herbal supplements, pro-biotics); therefore, a qualitative system was developed by which the indicator values as well as certain compounds such as oils and herbs could be dealt with as a binary variable, indicating either ‘1’ when the compound was present or ‘0’ when the compound was absent in a supplement.
The qualitative system also included the use of specific nutrient thresholds to identify (significant) amounts of nutrients in a supplement. For example, most royal jellies contain royal jelly and honey, although occasionally vitamin E is added by some manufacturers. As a result, the weighted average of a generic supplement would always contain a small amount of vitamin E; however, it was considered incorrect to mark these supplements as ‘vitamin E suppliers’, and hence the SU as a ‘vitamin E consumer’. The threshold was needed in order to deal with (very) small amounts of nutrients that are artificially created by the generic supplement system. Initially, a percentage of the Reference Nutrient Intake (RNI) values was chosen, but since not all nutrients have such a value assigned and in order to deal with all the nutrients in the same way, a percentage value of food-sourced intake served as a replacement. The distribution of nutrient intake from food diaries that had seven valid recorded days (n 16 258) served as the basis for setting this threshold (Appendix 2). The threshold of 5 % is low enough to ignore nutrients that might not have been present in the (true) supplement consumed, yet not so high that irregular users would be missed (e.g. a participant consuming 60 mg of vitamin C once weekly is, with this threshold value, still recognised as a vitamin C consumer). The threshold value of 5 % might be adjusted as experience in the analyses of the supplemental data increases.
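The thresholding step can be sketched in Python as follows. The function name `is_consumer`, the reference food intakes (80 mg/d for vitamin C and 10 mg/d for vitamin E) and the use of ‘greater than or equal to’ are assumptions made for illustration; the reference values actually used are those in Appendix 2.

```python
def is_consumer(daily_supplement_intake, reference_food_intake, fraction=0.05):
    """Return 1 if supplement intake reaches the threshold, otherwise 0.

    The threshold is a fraction (default 5 %) of a reference intake
    from food for the same nutrient, in the same unit.
    """
    threshold = fraction * reference_food_intake
    return int(daily_supplement_intake >= threshold)


# 60 mg of vitamin C taken once weekly, averaged over seven days.
weekly_vitamin_c = is_consumer(60 / 7, reference_food_intake=80.0)

# Trace of vitamin E created by a generic royal jelly profile (hypothetical).
trace_vitamin_e = is_consumer(0.1, reference_food_intake=10.0)

print(weekly_vitamin_c, trace_vitamin_e)
```

With these illustrative reference values, the once-weekly vitamin C user is still recognised as a consumer, whereas the trace of vitamin E introduced by averaging is ignored.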
Statistical analysis
Statistical analysis was undertaken on both the ViMiS database and SU. The supplements entered in the ViMiS database were counted and the median and interquartile range (IQR) of ‘nutrient/ingredient’ groups were calculated.
The SU and the route through which they obtained their nutrient profile were identified. The differences in the amounts of supplements reported between versions 1 and 2 of the 7dDD were tested using the χ 2 test. Finally, the supplement groups consumed, as well as the number of SU within each group, were tabulated to calculate the prevalence of supplement group consumption.
Results
The results reflect the current state of the ViMiS database, which is anticipated to be complete in the near future.
Vitamin and mineral supplement database – size
To describe all supplements to a maximum level of detail, 1394 supplement names and 158 brand/manufacturers' names were created in order to form supplement ID. To enter the ingredients on the supplement packaging, 1058 compound ingredients with their conversion factors were created and grouped into 158 nutrient/ingredient groups. As a result, the database contained 2066 supplements, dispersed over forty-five supplement groups, with a total of 16 586 compound ingredients.
Each supplement can potentially contribute a median of eleven ingredient/nutrient groups (with an IQR of 4–19) to total daily nutrient intake.
Supplement consumption
The diaries of 21 166 participants were entered, of whom 7770 (40·2 %) used 13 090 supplements, of which 4628 were distinct. Although (prescribed) drugs contribute to nutrient intake, the reasons for taking them can be different from taking supplements, which might confound future analysis. This group of SU can be easily excluded by dropping the DRG supplement group (see Appendix 1), resulting in 7515 (38·9 %) participants categorised as SU, consuming 12 524 supplements. The six most frequently consumed supplements accounted for 10 % of the entered data and were reported at least a hundred times in the diaries, whereas 24 % of the entered supplement ID were reported only once. Of the SU, 59 % used one, 23 % used two and 9 % used three supplements per day; the remaining 8 % used four to twelve supplements per day. SU filling in the second version of the diary (n 2084, 46 % multiple) recorded taking multiple supplements significantly more often compared to participants filling in the first version (n 5686, 38 % multiple, P <0·001).
Of the participants' supplements, 1501 (11 %) were a direct match with the manufacturers' database, 5399 (41 %) required assumptions and 5663 (41 %) were assigned a generic ID, of which 1878 (14 %) were without a brand specification since no nutrient data could be found for the recorded brand or the brand was missing from the diary. The remaining supplements (5 %) are still to be dealt with. The most frequently entered supplement IDs were non-specific (generic) data, for which we created 528 generic supplement ID, of which 296 were brand-specific and 232 non-brand-specific.
Supplement consumption group distribution
Table 2 shows the distribution of the supplement groups used in the EPIC-Norfolk study. Women used more supplements than men, 46·1 % and 33·9 %, respectively (data not shown). Cod liver oil supplements were the most frequently used supplements in both sexes, followed by garlic and multivitamins for men and multivitamins and evening primrose oil for women. The number of supplements reported is higher than the number of individuals within the same supplement group (Table 2). This indicates that participants are supplementing their diet with multiple supplements from the same group.
EPIC-Norfolk, European Prospective Investigation into Cancer and Nutrition in Norfolk; SU, supplement user; NSU, non-supplement user; N/A, not applicable.
Prevalence was calculated by dividing the number of participants consuming a particular supplement group by the number of participants in the cohort who answered the question relating to supplement use (SU + NSU).
The distribution of the several supplement groups consumed by SU is shown in the last two columns (see Appendix 1 for a full description of the supplement groups).
*The remaining supplement groups appeared in the following order (Appendix 1): less than 200 occurrences for YST, SB6, ZN_ and HER; less than 100 occurrences for MMS, BEE, GKO, LEC, OSC, GIN, KLP and AVA; less than 50 occurrences for GAG, COQ, SVA, FE_ and HDG; less than 25 occurrences for B12, SE_, XXX, SBU and FOL; less than 10 occurrences for MG_, BIO, PAN, AA_, CR_, SB1, SVD, OSB, MN_, K__ and SB2.
Discussion
We have described the design of the ViMiS database that can be merged with dietary data in order to calculate total nutrient exposure. The database identifies supplements by using five supplement characteristics and can be used to calculate nutrient data even if one or more of these characteristics are unknown. For 40·2 % of the EPIC-Norfolk cohort, the inclusion of supplement data from the ViMiS database increases their nutrient exposure.
We defined SU as anyone who reported taking a supplement during ≥1 d of their 7dDD. We recognise that a single, short-term measure is not ideal when analysing relationships with morbidity in the distant future, since supplement use can be easily started or stopped and frequency, dose and type can be easily changed(15). The diary, however, provides data comparable to the transcribed method(16), since participants were encouraged to include manufacturers’ labels. Hence, it provides precise, though temporal, estimates of supplement use. In addition, in the present analysis we were restricted to 1 week only, but repeat data from later time points within the EPIC-Norfolk study could be used in future to verify consistent and long-term SU. Another possible bias in the data collection occurred when the question about supplement use was changed. However, as this change happened towards the end of the first health check-up it could reflect a true increase of supplement use over time. Despite this, our data are similar to those of a UK-based study in which 45·1 % of the women used a supplement(3), and to national survey data, in which 30 % of the men and 41 % of the women used a supplement in the years 2000–2001(1).
A disadvantage of having a label-based ingredient database as opposed to direct chemical analysis is the uncertainty surrounding vitamin composition(17, 18). This is partly due to issues surrounding overage relating to loss of vitamins during storage, as legislation determines that a sufficient amount of the compound should be available. Potential developments, either in the database or in the statistical analysis, would be to multiply the nutrients by common overage factors, as mentioned in the report of the expert group on vitamins and minerals(18).
The development of a system of generic items for incompletely reported information on supplement consumption, as well as the assumptions made, proved to be useful since only 12 % of the participants provided sufficiently precise information to enable a direct match with the supplements in the ViMiS database. However, with the latest diary version, the instructions to participants were improved to enable reporting of the strength of a supplement. The way in which the nutrients for the generic supplement ID are created is comparable with a system developed for the Hawaiian cohort of the Multiethnic Cohort study(19). Both systems have used weighted means to create a nutrient composition of a generic supplement ID, although the generic ID in the EPIC-Norfolk study are more specific, since more supplement groups are identified than the multivitamins and multivitamins/minerals mentioned by Blitz et al.(19). This results in more specific nutrients being included in a generic supplement. Another advantage of the method developed for the EPIC-Norfolk study is flexibility: additions to the range of matching supplement ID can be made as soon as new manufacturers’ data are added to the ViMiS database.
The qualitative system identifies consumption of nutrients and is a useful addition for herbal supplements since the effective compound is often not known or methods for chemical analysis are far from standardised(20).
Conclusion
The ViMiS database system is flexible, is designed to store data at many levels and is fully compatible with the database used for the calculation of the diary data. With this information on nutrient intakes from all sources, including supplements, it will be possible to further investigate established relationships with blood biomarkers, nutrition and morbidity in the EPIC-Norfolk study(21, 22).
Acknowledgements
The present study was funded by the Medical Research Council, UK, and Cancer Research, UK. The authors have no conflict of interest. M.A.H.L. prepared the manuscript and has developed concepts of data entry and generic systems, cleaned data entry and assigned assumptions and generics to the data and entered manufacturers' data. A.B. created the programs for the ViMiS database. A.A.M., K.-T.K. and A.A.W. assisted in writing and revising the manuscript. A.A.W. developed the factor conversion system and started manufacturers' data collection. The authors thank Veronica van Scheltinga who entered a great proportion of the manufacturers’ data and the data enterers who entered the participants’ data, especially Sandra Owen and Crispin Philo. They also acknowledge the contribution of Professor Sheila A Rodwell (Bingham), principal investigator in EPIC-Norfolk, towards the preparation of this manuscript, and lament her untimely death in June 2009.
Appendix 1
A description of the forty-five available supplement groups in the vitamin and mineral supplement database
*DINER: Data Into Nutrients for Epidemiological Research, a data-entry system developed to enter 7 d food diaries and 24 h recalls for EPIC-Norfolk(7).
Appendix 2
Threshold values for transforming continuous nutrient values from supplements into qualitative binary variables, indicating either ‘1’ when the nutrient was present or ‘0’ when the nutrient was absent
ViMiS, vitamin and mineral supplements; LRNI, lower reference nutrient intake; RNI, Reference Nutrient Intake; BW, body weight; RE, retinol equivalents; NE, niacin equivalents; TE, tocopherol equivalents.
In order to deal with all the nutrients in the same way, a percentage value of food intake was chosen. The threshold was set in order to deal with (very) small amounts of nutrients that are artificially created by the generic supplement system. For example, most royal jellies contain royal jelly and honey; occasionally vitamin E is added by some manufacturers. As a result, the weighted average will always contain a small amount of vitamin E, but it was considered incorrect to mark these participants as ‘vitamin E consumers’. The threshold of 5 % of food intake is low enough to ignore nutrients that might not have been present in the supplement consumed, yet not so high that irregular users would be missed (e.g. a participant consuming 60 mg of vitamin C once weekly is, with this threshold value, still recognised as a vitamin C consumer). The cut-off value of 5 % might be adjusted as experience with the supplemental data increases.
*n 13 538, diaries of participants who completed all seven days of the food diary.
†An amount of the nutrient that is enough for only the few people in a group who have low needs(23).
‡An amount of the nutrient that is enough, or more than enough, for about 97 % of people in a group(23).
§1 kcal = 4·184 kJ.