Introduction
The United States Department of Agriculture (USDA) has been the primary source of public information in agriculture for over 150 years. Many agricultural market participants and analysts share a common belief that USDA forecasts function as a benchmark for other private and public forecasts, which is not surprising given the classic public goods problem of private underinvestment in information and the critical role that public information plays in coordinating the beliefs of market participants. Public interest in this topic has further increased with the surge in communications technology, computing power, storage, and remote sensing in the last two decades, which increased the competitiveness of the private sector in generating agricultural data.
While USDA “reliably provides consistent access to critical data and information used by farmers, ranchers, policymakers and other agricultural industry stakeholders … and creates a level playing field for access to agricultural market information…, for a variety of reasons, farmer participation in statistical surveys has declined and following increased price volatility coinciding with National Agricultural Statistical Service (NASS) releases, confidence that the agency’s reports reflect market realities has faded in recent years” (AFBF, 2021). Some of the most prominent examples include controversies surrounding USDA estimates of planted area, crop production, yields, and inventories during the 2019 crop year, which was affected by unprecedented flooding and weather-related planting delays (Huffstutter and Polansek, 2019). Johansson, Effland, and Coble (2017) document the steep fall in producer response rates to NASS Acreage and Production surveys, from 80–85% in the 1990s to only 55–65% in the 2010s. These conditions have further increased public interest in the reliability of USDA reports.
The theory of optimal forecasting stretches back to at least Theil (1958), who developed a theoretical framework for evaluating rolling-event forecasts. His theory was extended by Mincer and Zarnowitz (1969). Clements (1997) and Nordhaus (1987) introduced a framework for testing the accuracy of fixed-event forecasts. Based on this literature, the fundamental characteristics of optimal forecasts are bias, accuracy, and efficiency (e.g., Diebold and Lopez, 1998; Runkle, 1991). Over time, additional features have been added and explored, such as encompassing and informativeness. This resulted in an extensive body of knowledge that examined various characteristics of USDA reports, highlighted their benefits and shortcomings, and offered areas of improvement. However, to the best of our knowledge, this literature has not been systematically and comprehensively reviewed.
The goal of this study is to systematically review the literature on the bias, accuracy, and efficiency of USDA forecasts and to critically assess the methods and findings of these studies. Our study provides an independent assessment of the state of knowledge regarding the quality of USDA information to inform the public and resolve misunderstandings about the reliability of that information. Our findings also highlight potential gaps in the state of knowledge and suggest opportunities for further research. Because this literature is very extensive, we focus only on studies of USDA forecast accuracy and efficiency.
Studies of the economic value and market impact of USDA information are not included in this evaluation as they have been recently surveyed by Isengildina-Massa, Karali, and Irwin (2023). Similar to this study, they found that corn and soybean markets and WASDE and Crop Production reports received the most attention in the literature. In terms of approaches, futures market impact studies, including unconditional and conditional market reactions, were used most often, while welfare studies and market impact studies using options prices were less common. The authors demonstrated extensive evidence of market impact associated with situation- or inventory-related reports, with limited information regarding the market impact of outlook reports. However, while outlook reports do not tend to move the markets, they are likely valuable in helping to level the playing field and decrease information asymmetries among market participants. This value would be determined by the accuracy and reliability of the USDA forecasts, especially relative to alternative information, which will be examined in the current study.
USDA Forecasts
USDA releases over 400 different reports every year that cover different topics, from crop production to farm income to food inflation. The focus of this study is on reports and forecasts associated with commodity situation and outlook, as these reports have been shown to have the most value and impact on agricultural markets (Isengildina-Massa, Karali, and Irwin, 2023). The purpose of these reports goes back to the historical mission of USDA to provide information to help farmers make better production and marketing decisions.
USDA’s crop production forecasts (especially for corn and soybeans) received the most attention in the previous literature as these reports are well-known to cause significant market reactions. NASS is the USDA agency primarily responsible for crop production reports that include Prospective Plantings (released annually at the end of March), Acreage (released annually at the end of June), Crop Production (released monthly depending on production cycle), Crop Conditions (released weekly depending on production cycle), Winter Wheat Seedings (released annually in December) as well as Crop Production Annual Summary (released annually in January) and Small Grains Annual Summary (released in September). Crop production forecasts released in the World Agricultural Supply and Demand Estimates (WASDE) reports prepared by the World Agricultural Outlook Board (WAOB) during the months when NASS forecasts are not available follow time-series-based methods. The sampling and survey methodologies behind these forecasts were described by Vogel and Bange (1999). In addition, many studies focused on both the methodology and evaluation of these forecasts. For example, Good and Irwin (2006, 2011, 2013), Irwin, Sanders, and Good (2014a), Irwin, Good, and Sanders (2015), and Irwin and Good (2016) explain the methodology behind USDA crop yield forecasts and argue that USDA should “open the black box” behind their forecasts and become more transparent about their methods and any changes to their approaches in order to build trust and support across producers and encourage them to participate in the surveys on which these forecasts are built.
USDA livestock production forecasts are published within WASDE reports. For example, USDA estimates for quarterly production for beef, pork, and poultry normally begin in August of the preceding year, about 17 months before December of the year for which the estimate is being made. Actual or final production levels are published in Livestock, Dairy, and Poultry Outlook reports. Cattle on Feed and Hogs and Pigs reports that provide data such as placements, marketings, and intended farrowings are considered inventory reports.
USDA price forecasts are available from various sources. For example, seasonal average price forecasts for various commodities are published in WASDE reports. Monthly crop prices are published within various outlook reports (e.g., Rice Outlook, Oil Crops Outlook, and Wheat Outlook). Cattle and hog price forecasts are published in Livestock and Poultry Situation and Outlook and Livestock, Dairy, and Poultry Outlook reports.
In addition to production and price forecasts published as part of the WASDE reports, the WASDE balance sheets include several other supply and demand forecasts that have been evaluated in previous studies. For example, figure 1 shows how the WASDE forecasting cycle for cotton corresponds to the marketing year, stages of the production cycle, and the release of other reports from NASS and the Census Bureau. This figure demonstrates that most agricultural forecasts are fixed-event forecasts, with a series of forecasts of the same target event, frequently defined as a marketing year average or total value, where marketing year definitions are commodity-specific based on the production cycle.
Grain Stocks reports are widely used by the industry to gauge the pace of domestic use based on how much crop was still left in storage. These reports are issued by NASS quarterly, at the beginning of January and at the end of March, June, and September, and describe stocks of multiple commodities in storage as of the beginning of these months (December 1 for January report). Thus, each report provides a survey-based snapshot of commercial and on-farm stocks at various points in time and shows how these stocks change during the marketing year. The challenge with evaluating the accuracy of these reports is the lack of the “final value” as the estimates are subject to revision only if new information becomes available in the quarter following initial publication and again following the December Grain Stocks report published in January each year.
USDA’s forecasts of net farm income (NFI) and net cash farm income (NCI) include measures of receipts from various agricultural operations, direct government payments, operating expenses, as well as taxes and fees, capital consumption and payments to stakeholders (NFI only) incurred during the calendar year. These forecasts of the annual NFI and NCI estimates are released over the 18 months (the forecast horizon) leading up to the release of the first official estimate, as displayed in figure 2. Despite their importance, NFI and NCI forecasts have not been evaluated rigorously until recently as the early studies had to collect these data by hand. The Farm Income and Wealth Statistics Forecast and Estimate Data Archive (https://www.ers.usda.gov/data-products/farm-income-and-wealth-statistics/), which became available from the USDA’s Economic Research Service (ERS) in April 2022, spurred a burst of new studies of these forecasts.
Long-term agricultural baseline forecasts are fixed-horizon, 10-year path projections that provide dynamic information along their paths. Figure 3 shows baseline projections against realized values for the harvested corn area. Unlike fixed-event agricultural forecasts, which aim at predicting future values of the variables of interest, long-term projections are not intended to be a forecast of what the future value will be but offer a conditional, long-run scenario about what would be expected to happen under a continuation of current farm legislation and other specific assumptions (USDA Office of Chief Economist, 2023). USDA baseline projections are produced by the Interagency Agricultural Projections Committee, but the ERS takes the lead role. The projections reflect a composite of model results and judgment-based analysis. Hjort et al. (2018) provide a detailed description of the USDA baseline model and various processes followed during the preparation of the baseline report, which is initiated in August–September and published in February of every year.
Review Methods and Findings
The design of this systematic literature review follows the guidelines of Siddaway et al. (2019). The main search for literature was conducted in September 2021, and an update was made in July 2023. Searches for relevant peer-reviewed literature were made using two online publication databases: EBSCOHOST and Web of Science. The search terms utilized during the literature search included (“United States Department of Agriculture” OR USDA OR “Department of Agriculture”) AND (report OR forecast OR outlook OR reports OR forecasts OR projection) AND (accuracy OR evaluation OR efficiency OR rationality). We focused on academic journal publications and reports in English published during 1990–2023. After deleting duplicates, 579 publication records were collected. These studies were screened for relevance as we wanted to focus on the studies that examined USDA reports containing commodity situation and outlook forecasts and estimates. The bibliographies of these publications were also explored, and potentially relevant studies not found in online databases were also recorded. In addition, the search was complemented using AgEcon Search (https://ageconsearch.umn.edu/) and Farmdoc (https://farmdocdaily.illinois.edu/) websites. Finally, the resulting list of studies was reviewed for completeness by Scott H. Irwin to identify any missing studies. The resulting sample consisted of 77 studies deemed relevant for this review. Most of these studies (48) were published in peer-reviewed academic journals, six studies were published in scientific reports, five were conference papers, and 18 were farmdoc articles.
Figure 4 shows the number of published studies on USDA reports over time. A large spike in 2014 is associated with several farmdoc articles discussing USDA’s grain stocks estimates. Overall, there is a growing trend in the number of studies on USDA report accuracy over time, with an average of about four studies a year in the last decade. Figure 5 shows the distribution of published studies across topics based on report type. This figure demonstrates that the accuracy of crop production forecasts has received the most scrutiny, followed by price and WASDE forecasts, while the evaluation of other types of forecasts (livestock production, baseline, and farm income) received less attention. In terms of commodities, corn and soybean forecasts received the most attention, with 46 and 37 studies, respectively, addressing them. Wheat forecasts were investigated in 19 studies, followed by hogs (15) and cattle (10). Less studied commodities include cotton (6), poultry (5), milk (3), rice and eggs (2) and sugar (1). We also included seven studies that investigated more general forecasts like farm income or exports that were not commodity-specific.
This review revealed a wide variety of methods that have been used for analyzing the accuracy and efficiency of USDA forecasts. Forecast evaluation is typically focused on forecast errors expressed in either raw units (realized – predicted) or percentages ((realized – predicted)/realized) or log percentages (100*(ln(realized) – ln(predicted))) to control for changes in levels of the forecasted variables over time. It is important to note that the findings of these studies can be very different based on the forecasts examined, commodity, sample period, and methodology. Due to this heterogeneity, we decided against summarizing the findings using a meta-analysis. Meta-analysis is appropriate “when the reviewer wishes to bring together many studies that have empirically tested the same hypothesis” (Siddaway et al., 2019, p. 754). Instead, we focus on various measures of rational forecasts assessed for these reports and discuss the findings and implications of this research.
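The three error definitions above can be written out as a minimal sketch. The function name and the example numbers are hypothetical, and scaling the percentage error by 100 is an assumption made here so that it is comparable with the log-percentage error.

```python
import math

def forecast_errors(realized, predicted):
    """Return raw, percentage, and log-percentage forecast errors."""
    raw = realized - predicted
    # Scaled by 100 (assumption) so the units match the log-percentage error.
    pct = 100.0 * (realized - predicted) / realized
    log_pct = 100.0 * (math.log(realized) - math.log(predicted))
    return raw, pct, log_pct

# Hypothetical example: a forecast of 171 against a realized value of 180.
raw, pct, log_pct = forecast_errors(realized=180.0, predicted=171.0)
print(raw, round(pct, 2), round(log_pct, 2))
```

With this sign convention a positive error means the forecast underestimated the realized value.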
Are USDA Forecasts Biased?
Tests of bias examine whether positive and negative forecast errors cancel out and the average forecast errors equal zero. Traditionally, bias has been evaluated with either a t-test or a regression-based test in which the forecast error is regressed against a constant (Holden and Peel, 1990). While these approaches are equivalent, the benefit of the regression-based approach is the ease of correction for heteroskedasticity and autocorrelation using Newey and West (1987) standard errors. Variations of this test include a trend variable to assess whether bias has changed over time. Other studies (e.g., Sanders and Manfredo, 2007) have applied what is widely known as a Mincer-Zarnowitz equation to assess forecast bias. In this approach, the realized value is regressed against a constant and the forecast. This regression tests whether forecasts are unbiased (the coefficient for the constant is zero) and properly scaled (the coefficient for the forecast is one). However, the estimation of this equation may encounter statistical challenges, especially when there is a lack of stationarity in either realized values or forecasts.
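Both tests described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure of any reviewed study: the function names and sample errors are hypothetical, the Newey-West correction uses Bartlett weights with a lag length chosen by the user, and variances are divided by n, so with zero lags the statistic differs slightly from the textbook t-test.

```python
import math

def bias_test(errors, lags=0):
    """Regress errors on a constant; return (mean error, HAC std. error, t-stat)."""
    n = len(errors)
    mean = sum(errors) / n
    dev = [e - mean for e in errors]

    def autocov(j):
        return sum(dev[t] * dev[t - j] for t in range(j, n)) / n

    # Newey-West long-run variance with Bartlett kernel weights.
    long_run_var = autocov(0)
    for j in range(1, lags + 1):
        weight = 1.0 - j / (lags + 1)
        long_run_var += 2.0 * weight * autocov(j)
    se = math.sqrt(long_run_var / n)
    return mean, se, mean / se

def mincer_zarnowitz(realized, forecasts):
    """OLS of realized values on a constant and the forecast: (intercept, slope)."""
    n = len(realized)
    f_bar = sum(forecasts) / n
    r_bar = sum(realized) / n
    sxy = sum((f - f_bar) * (r - r_bar) for f, r in zip(forecasts, realized))
    sxx = sum((f - f_bar) ** 2 for f in forecasts)
    slope = sxy / sxx
    return r_bar - slope * f_bar, slope

# Hypothetical forecast errors (realized minus predicted).
errors = [2.0, -1.0, 3.0, 0.0, 1.0]
mean, se, t_stat = bias_test(errors, lags=1)
```

A mean error far from zero relative to its standard error points to bias. In the Mincer-Zarnowitz regression, an intercept near zero and a slope near one are consistent with unbiased, properly scaled forecasts; a joint test of both restrictions is omitted here.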
Table 1 shows the summary of empirical findings regarding bias in USDA forecasts. It appears that underestimation was the most common form of bias in USDA forecasts, as it was found in some corn and soybean yield forecasts, hog and cattle production forecasts, corn, soybean, wheat, and hog price forecasts, soybean and wheat export forecasts, farm income forecasts, and baseline harvested area forecasts for soybeans and corn. This tendency is likely due to USDA analysts underestimating long-term growth rates in these variables. On the other hand, some forecasts performed well, showing a lack of bias over time in forecasts of corn yield, soybean and wheat acreage, more recent livestock production, rice price, as well as sugar and cotton WASDE forecasts, among others.
It is important to note that most of these studies implicitly assume that public forecasters minimize a symmetric linear or quadratic loss function, so that positive and negative forecast errors should cancel out. Only one study, by Bora, Katchova, and Kuethe (2020), explored the possibility of asymmetric loss functions for USDA forecast providers with different weights placed on over- or under-prediction. They argued that “USDA is averse to overpredicting net cash income at the early stages of the forecasting process…USDA has a higher cost overpredicting both price and yield for corn, soybeans, and wheat.” Thus, under an asymmetric loss function, these biased forecasts would be considered optimal.
How Accurate Are the USDA Forecasts?
Traditional measures of forecast accuracy assess the magnitude of forecast errors regardless of sign, such as the mean absolute error (MAE) and the root mean squared error (RMSE). Studies also evaluated the changes in forecast accuracy over time by regressing absolute forecast errors against a constant and a time trend. For example, Table 2 demonstrates that several studies (e.g., Irwin and Good, 2011a; Irwin, Good, and Sanders, 2014a, b) found that corn and soybean production forecasts were consistently accurate with evidence of improvements in the accuracy of corn yield forecasts. Improvements in accuracy were also found in hog and cattle production forecasts (Bailey and Brorsen, 1998), China cotton WASDE forecasts (Isengildina-Massa, MacDonald, and Xie, 2012), and wheat ending stocks forecasts (Xiao, Hart, and Lence, 2017). On the other hand, the accuracy of soybean, wheat and rice price forecasts appeared to decline (No and Salassi, 2009). Most studies also found that fixed-event forecast errors demonstrate a pattern of errors decreasing across the forecast horizon as more information becomes available (e.g., Isengildina-Massa, Karali, and Irwin, 2013; Isengildina-Massa, MacDonald, and Xie, 2012).
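As a minimal sketch of these accuracy measures, the functions below compute the mean absolute error, the RMSE, and the ordinary least squares trend in absolute errors. The function names and sample data are hypothetical, and the time index t = 1, ..., n is an assumption; standard errors for the trend coefficient are omitted.

```python
import math

def mean_absolute_error(errors):
    return sum(abs(e) for e in errors) / len(errors)

def root_mean_squared_error(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def abs_error_trend(errors):
    """OLS of |e_t| on a constant and t = 1..n; returns (intercept, slope)."""
    n = len(errors)
    t = list(range(1, n + 1))
    y = [abs(e) for e in errors]
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope

# Hypothetical errors whose magnitude shrinks over time.
errors = [4.0, -3.0, 2.0, -1.0]
intercept, slope = abs_error_trend(errors)
```

A negative slope indicates that absolute errors are shrinking over the sample, which the studies above interpret as improving accuracy.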
> indicates that the USDA forecast is better, more accurate than the alternative, < indicates the USDA forecast is worse, = indicates similar accuracy between the USDA and alternative forecast.
Some studies also used a directional accuracy test developed by Henriksson and Merton (1981). The test is based on 2 × 2 contingency tables, reflecting the direction of year-to-year change in each variable forecast for each stage’s average forecast. The frequency with which forecasts and actual realizations of the variable decrease or increase together is compared with the expected frequency of independent directional changes using a Chi-squared statistic. The results of this evaluation reflect a proportion of time the forecast correctly predicts the directional change in the realized value. For example, Sanders and Manfredo (2003) found that USDA correctly identified the direction of price change in at least 70% of its forecasts over 1982–2002. No (2007) found that USDA hog price forecasts have a lower accurate forecast ratio and higher worst forecast ratio than the forecasts of the time-series model, suggesting weaker directional accuracy in the USDA model. Directional accuracy test results in Isengildina-Massa, MacDonald, and Xie’s (2012) study highlighted the difficulty the USDA faces in forecasting China’s domestic use, exports, and ending stocks for cotton.
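The contingency-table comparison described above can be sketched as follows. This is a plain Pearson Chi-squared test of independence on a 2 × 2 table and is offered only as an illustration; it is not claimed to reproduce the exact Henriksson and Merton statistic. The function names and data are hypothetical, a change of zero is counted as a decrease (an assumption), and every row and column of the table is assumed to be non-empty.

```python
import math

def direction_table(predicted_changes, actual_changes):
    """Rows: predicted up/down; columns: actual up/down."""
    table = [[0, 0], [0, 0]]
    for p, a in zip(predicted_changes, actual_changes):
        row = 0 if p > 0 else 1
        col = 0 if a > 0 else 1
        table[row][col] += 1
    return table

def chi_squared_independence(table):
    """Pearson statistic and p-value (1 degree of freedom) for a 2 x 2 table."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [table[0][j] + table[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_tot[i] * col_tot[j] / n
            stat += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical year-to-year changes (sign is all that matters here).
predicted = [1, 1, 1, -1, -1, -1, 1, -1]
actual = [1, 1, -1, -1, -1, 1, 1, -1]
table = direction_table(predicted, actual)
hit_rate = (table[0][0] + table[1][1]) / len(predicted)
stat, p_value = chi_squared_independence(table)
```

The hit rate is the share of periods in which the forecast and the realized value moved in the same direction; the Chi-squared statistic asks whether that agreement is more frequent than independence would imply.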
Accuracy measures were also used to compare the accuracy of forecasts in question to a certain benchmark, such as a naïve forecast alternative (using Theil’s U statistic), a time-series forecast, or another alternative forecast. The modified Diebold-Mariano test (Harvey, Leybourne, and Newbold, 1997) was typically used to determine whether the difference between the accuracy of two alternative forecasts is significantly different from zero. The results of this evaluation approach indicate which forecast is more accurate relative to the included alternative. Most of the previous studies conducted accuracy evaluation separately for each forecast horizon, since fixed-event forecasts are expected to become more accurate across the forecasting cycle as more information becomes available. Some recent studies proposed methods to compare the relative accuracy of path forecasts (e.g., Patton and Timmermann, 2012). For example, Bora, Katchova, and Kuethe (2022) use the tests of multi-horizon superior predictive ability proposed by Quaedvlieg (2021) that jointly consider all horizons along the entire projection path. These tests evaluate the average predictive ability for a path forecast with a larger loss at some horizons that is compensated by more accurate performance at other horizons when compared to the alternative path forecast.
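The single-horizon benchmark comparisons described at the start of this paragraph can be illustrated with a short sketch. It assumes squared-error loss, defines Theil's U as the ratio of the forecast RMSE to the RMSE of a naïve benchmark (one of several versions in use), and applies the small-sample correction of Harvey, Leybourne, and Newbold to the Diebold-Mariano statistic. The function names and data are hypothetical, and the multi-horizon path tests are not covered.

```python
import math

def theil_u(forecast_errors, naive_errors):
    """Ratio of forecast RMSE to naive-benchmark RMSE; below 1 favors the forecast."""
    rmse_f = math.sqrt(sum(e * e for e in forecast_errors) / len(forecast_errors))
    rmse_n = math.sqrt(sum(e * e for e in naive_errors) / len(naive_errors))
    return rmse_f / rmse_n

def modified_dm(errors_1, errors_2, h=1):
    """Modified Diebold-Mariano statistic for h-step-ahead forecasts."""
    d = [a * a - b * b for a, b in zip(errors_1, errors_2)]  # loss differential
    n = len(d)
    d_bar = sum(d) / n
    dev = [x - d_bar for x in d]

    def autocov(k):
        return sum(dev[t] * dev[t - k] for t in range(k, n)) / n

    # Variance of the mean loss differential using autocovariances up to h - 1.
    # Note: for h > 1 this estimate is not guaranteed to be positive.
    var_d_bar = (autocov(0) + 2.0 * sum(autocov(k) for k in range(1, h))) / n
    dm = d_bar / math.sqrt(var_d_bar)
    correction = math.sqrt((n + 1 - 2 * h + h * (h - 1) / n) / n)
    return correction * dm

# Hypothetical error series for two competing forecasts.
stat = modified_dm([1.0, 2.0, 1.0, 2.0], [2.0, 3.0, 2.0, 3.0], h=1)
```

The statistic is compared with critical values from a Student's t distribution with n − 1 degrees of freedom. A negative value means the first forecast has the smaller average squared error.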
Table 2 provides a summary of empirical findings regarding the accuracy of USDA forecasts and demonstrates that USDA’s cattle, poultry, milk, and egg production forecasts were more accurate than a naïve alternative (Sanders and Manfredo, 2008). Hog and cattle baseline export forecasts also tend to outperform the naïve alternative, but the results are mixed for other baseline forecasts (e.g., Bora, Katchova, and Kuethe, 2023; Luke and Tonsor, 2023; Regmi, Kuethe, and Foster, 2022). Hog, cattle, and poultry price forecasts were shown to be more accurate than time-series forecasts (Sanders and Manfredo, 2003). Elam and Holder (1985) found that USDA rice forecasts had smaller errors than random walk model forecasts. Sanders and Manfredo (2005) found that USDA forecasts were statistically more accurate than competing time-series forecasts for fluid milk.
Futures-based forecasts were frequently used as a comparison benchmark for price forecasts, and the evidence is mixed: corn and milk price forecasts appeared more accurate than futures-based forecasts, while hog and cattle price forecasts performed comparably to or less accurately than futures (e.g., Colino and Irwin, 2010; Hoffman et al., 2015; Irwin, Gerlow, and Liu, 1994; Manfredo and Sanders, 2004; Irwin and Good, 2015). For example, Irwin, Gerlow, and Liu (1994) found that there were no significant differences between the forecasting performance of live hog and live cattle futures and USDA expert predictions over 1980–1991. Manfredo and Sanders (2004) found that at horizons less than six months, the lean hog futures-based forecasts were significantly more accurate than both the USDA and Extension Service forecasts. On the other hand, Hoffman et al. (2015) found that WASDE corn price projections had significantly smaller errors relative to futures-adjusted forecasts in 4 out of 16 forecasting periods. Franken et al. (2018) examined information transmission between hog futures and expert price forecasts and found that expert forecasts were substantially influenced by futures prices but also had an impact on both futures and cash prices.
Several studies assessed the accuracy of USDA price forecasts relative to their extension counterparts as well as futures prices. For example, Kastens et al. (1998) reported that extension forecasts were more accurate than USDA forecasts for livestock, but not for crops. However, these forecasts were generally not more accurate than futures-derived forecasts. Colino and Irwin (2010) concluded that outlook forecasts of hog and cattle prices from extension and USDA provide incremental information relative to futures prices based on RMSE comparisons. Colino et al. (2012) investigated whether the accuracy of outlook hog price forecasts can be improved using composite forecasts and found that futures and equally weighted composite procedures improve the accuracy of outlook forecasts, but naïve no-change forecasts are less accurate than outlook forecasts.
Private forecasts are some of the most important benchmarks for USDA forecasts as they directly address the issue of the relative accuracy of private versus public information and the relevance of public information in the presence of private sources. However, private forecasts are only available for a limited number of forecasts and commodities: acreage, production, and grain stocks for corn, soybeans, and wheat; Hogs and Pigs reports; and Cattle on Feed reports. This coverage highlights the areas that private forecasters perceive to be of highest value. Several studies assessed the accuracy of USDA crop production forecasts relative to private forecasts (e.g., Egelkraut et al., 2003; Good and Irwin, 2006; Isengildina-Massa, Karali, and Irwin, 2020). For example, Isengildina-Massa, Karali, and Irwin (2020) demonstrated that in the vast majority of cases, USDA forecasts were more accurate than their private counterparts. The accuracy advantage of the USDA forecasts was most consistent in corn, largest in wheat, and least prevalent in soybeans. Specifically, the authors found consistent accuracy advantages of USDA in Prospective Plantings, Acreage, and October Crop Production forecasts for corn. On the other hand, the only evidence of private forecasts dominating the USDA was found for August corn production forecasts during the 1990s and early 2000s. However, it appears that USDA has regained its advantage in August corn production forecasts since the mid-2000s. The authors concluded that it is important to maintain the response rates to USDA surveys and combine them with other data to ensure high quality of these forecasts, especially as extensive growth in remote sensing technology may increase competition from the private sector and erode USDA’s advantage.
Are USDA Forecasts Efficient?
Studies of weak-form efficiency, shown in Table 3, typically examined whether forecast errors were orthogonal to information available at the time the forecasts are made, such as forecasts themselves and prior forecast errors. For example, Runkle (1991) concluded that hog production forecasts did not include previous information. Sanders and Manfredo (2002) showed that beef, pork, and broiler production forecasts repeated past errors. Sanders and Manfredo (2003) demonstrated that cattle and broiler price forecasts also repeated past errors, while hog price forecasts did not. Price forecasts for hogs, turkeys, eggs and milk also tended to repeat price errors (Sanders and Manfredo, 2007). No and Salassi (2009) found that soybean, wheat, and rice price forecasts were correlated with past prices. Isengildina-Massa, MacDonald, and Xie (2012) revealed that forecasts for China’s cotton production, domestic use, and exports were frequently correlated with past errors, while U.S. cotton domestic use forecasts were correlated with forecast levels.
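The two orthogonality checks mentioned above, errors against forecast levels and errors against their own lagged values, can be sketched with simple regressions. The helper name, the sample numbers, and the use of a single lag are assumptions for illustration; significance tests on the slopes are omitted.

```python
def ols_slope(x, y):
    """Slope from an OLS regression of y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical forecasts and their errors (realized minus predicted).
forecasts = [10.0, 20.0, 30.0, 40.0]
errors = [1.0, 2.0, 3.0, 4.0]

# Errors regressed on forecast levels: a nonzero slope suggests mis-scaled forecasts.
beta = ols_slope(forecasts, errors)

# Errors regressed on the previous error: a nonzero slope suggests repeated errors.
rho = ols_slope(errors[:-1], errors[1:])
```

Under weak-form efficiency both slopes should be statistically indistinguishable from zero.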
Some studies also assessed orthogonality to other information available at the time the forecasts were made. For example, Isengildina-Massa, Karali, and Irwin (2013) demonstrated that WASDE forecasts for corn, soybeans, and wheat did not efficiently incorporate macroeconomic information and forecast errors included some behavioral sources. They found that corn, soybean, and wheat forecast errors grew during periods of economic growth and with changes in exchange rates, while inflation and changes in oil price had a much smaller impact. For the behavioral sources, they identified patterns consistent with leniency and pessimism across different categories. Regmi, Kuethe, and Foster (2022) examined macroeconomic sources of errors in baseline export forecasts.
By far the most common form of inefficiency assessed in USDA reports is the correlation of revisions of fixed-event forecasts (Nordhaus, 1987). According to Nordhaus, if forecasts are weak-form efficient, the forecasts should follow a random walk, so that successive revisions are uncorrelated. If revisions are correlated, or predictable, it means that forecasts themselves are partially predictable and therefore inefficient. Positive correlation in revisions, described as “smoothing,” has been demonstrated repeatedly in USDA crop production forecasts. For example, Isengildina, Irwin, and Good (2006) showed that revisions of NASS corn and soybean production forecasts over 1970–2004 were sometimes positively correlated and directionally consistent. This pattern of predictability in production forecast revisions may be due to a conservative bias in farm operators’ assessments of yield potential and in the procedure for translating enumerators’ information about plant fruit counts into objective yield estimates. The authors argued that losses in forecast accuracy due to smoothing were statistically and economically significant. Sanders et al. (2009) investigated forecast smoothing in the USDA’s cotton production forecasts and demonstrated how forecasting practitioners and farm managers should correct these forecasts.
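A minimal sketch of the revision-correlation check follows. Each list is a hypothetical sequence of forecasts for one fixed target event; revisions are pooled across events and the current revision is regressed on the previous one. The function names and numbers are illustrative assumptions, and inference on the slope is omitted.

```python
def revisions(forecast_path):
    """Successive revisions of forecasts for one fixed target event."""
    return [b - a for a, b in zip(forecast_path, forecast_path[1:])]

def revision_slope(forecast_paths):
    """OLS slope of the current revision on the previous revision, pooled over events."""
    x, y = [], []
    for path in forecast_paths:
        r = revisions(path)
        x.extend(r[:-1])
        y.extend(r[1:])
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical paths: gradual adjustment versus overshooting and reversal.
smoothed = revision_slope([[100.0, 102.0, 103.0, 103.5]])
jumpy = revision_slope([[100.0, 104.0, 102.0, 103.0]])
```

A positive slope corresponds to positively correlated revisions, and a negative slope to revisions that tend to reverse; a slope near zero is what weak-form efficiency implies.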
Isengildina, Irwin, and Good (2013) found that although the pattern of smoothing may appear obvious to market analysts in hindsight, it is difficult to anticipate. In other words, one would need to know that a big crop year is expected to apply the pattern of “big crop getting bigger” to crop production revisions. Irwin, Good, and Newton (2014) updated and extended this analysis to show that historically not all big crops got bigger and to illustrate the challenges of anticipating the size of the 2014 crop during August. Nevertheless, Isengildina-Massa, Karali, and Irwin (2017) showed that market participants appear to be aware of smoothing and adjust for it in forming their price expectations. This was also demonstrated by Xie, Isengildina-Massa, and Sharp (2016), who developed a statistical procedure for the correction of smoothing in corn, soybean, wheat, and cotton production forecasts and demonstrated potential improvements in accuracy resulting from this correction.
Other studies found smoothing in WASDE sugar forecasts (Lewis and Manfredo, Reference Lewis and Manfredo2012), corn WASDE and ending stocks forecasts (Good and Irwin, Reference Good and Irwin2014), soybean WASDE forecasts (MacDonald and Ash, Reference MacDonald and Ash2016; MacDonald, Ash, and Cooke, Reference MacDonald, Ash and Cooke2017), WASDE ending stocks forecasts for corn, soybeans, and wheat (Xiao, Hart, and Lence, Reference Xiao, Hart and Lence2017), and hog baseline import forecasts (Luke and Tonsor, Reference Luke and Tonsor2023). On the other hand, some evidence of negative correlations in revisions, described as “jumpiness” or over-reaction to new information, has been found for cattle baseline export forecasts (Luke and Tonsor, Reference Luke and Tonsor2023), some farm income forecasts (Kuethe, Hubbs, and Sanders, Reference Kuethe, Hubbs and Sanders2018), and beef, pork, and broiler production forecasts (Sanders and Manfredo, Reference Sanders and Manfredo2002).
It is important to note that the interpretation of correlations in revisions has changed over time. While the term “smoothing” tends to suggest strategic behavior by forecast providers, Goyal and Adjemian (Reference Goyal and Adjemian2023) argued that correlated revisions may also be explained by information rigidities that cause forecasts to be infrequently or only partially updated. The authors applied a framework developed by Coibion and Gorodnichenko (Reference Coibion and Gorodnichenko2015) and demonstrated that information rigidities, rather than smoothing, are the most likely cause of correlations in crop production revisions, because production and yield data are either too costly to obtain or too noisy. Therefore, the authors concluded that improving these forecasts should be based on better access to crop data.
Do USDA Forecasts Encompass Other Information?
A test of forecast encompassing was developed by Harvey, Leybourne, and Newbold (Reference Harvey, Leybourne and Newbold1998) and has been used in previous studies of USDA forecasts shown in Table 4 as an additional evaluation of relative accuracy. If a preferred forecast encompasses an alternative forecast, then the alternative forecast provides no useful information beyond that provided in the preferred forecast. This test is based on a regression used to evaluate the covariance between the preferred forecast error series (the dependent variable) and the difference between the preferred and alternative forecast error series (the independent variable). If this covariance is zero, the preferred forecast is said to encompass the competing one.
> indicates that the USDA forecast is better, more accurate than the alternative, < indicates the USDA forecast is worse, = indicates similar accuracy between the USDA and alternative forecast.
This test has been used extensively to examine whether additional information, such as time series forecasts (e.g., Sanders and Manfredo, Reference Sanders and Manfredo2002, Reference Sanders and Manfredo2003, Reference Sanders and Manfredo2004) or futures-based forecasts (e.g., Manfredo and Sanders, Reference Manfredo and Sanders2004; Colino and Irwin, Reference Colino and Irwin2010; Hoffman et al., Reference Hoffman, Etienne, Irwin, Colino and Toasa2015; Sanders and Manfredo, Reference Sanders and Manfredo2005), may help improve USDA forecasts. If this additional information is not encompassed in the USDA forecast, a more accurate combination forecast may be constructed. For example, Hoffman et al. (Reference Hoffman, Etienne, Irwin, Colino and Toasa2015) argued that composite forecasts combining WASDE and futures-based forecasts would reduce corn price forecast errors by an average of 12–16 percent.
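As a rough illustration of the encompassing regression and the resulting composite forecast, the sketch below regresses the preferred (USDA) forecast errors on the difference between the preferred and alternative forecast errors. It assumes a regression without an intercept and uses a plain least-squares slope, not the small-sample corrected statistic of the formal test. All data values and names (`encompassing_weight`, `composite`, `rmse`) are hypothetical.

```python
from math import sqrt

# Hypothetical realized prices and two competing forecasts (made-up numbers).
actual = [3.50, 4.20, 3.90, 5.10, 4.60, 3.70, 4.00, 4.80]
usda = [3.70, 4.00, 4.10, 4.80, 4.70, 3.90, 3.80, 4.50]
alternative = [3.40, 4.30, 3.70, 5.00, 4.90, 3.60, 4.20, 4.60]

def rmse(forecast, realized):
    """Root mean squared error of a forecast series."""
    n = len(realized)
    return sqrt(sum((a - f) ** 2 for f, a in zip(forecast, realized)) / n)

def encompassing_weight(preferred, alt, realized):
    """Slope from regressing e1 on (e1 - e2) through the origin.

    A slope of zero means the preferred forecast encompasses the alternative;
    a nonzero slope is the weight the alternative earns in a composite.
    """
    e1 = [a - f for f, a in zip(preferred, realized)]
    e2 = [a - f for f, a in zip(alt, realized)]
    diff = [x - y for x, y in zip(e1, e2)]
    return sum(x * d for x, d in zip(e1, diff)) / sum(d * d for d in diff)

def composite(preferred, alt, weight):
    """Weighted combination: (1 - weight) * preferred + weight * alternative."""
    return [(1 - weight) * p + weight * q for p, q in zip(preferred, alt)]

lam = encompassing_weight(usda, alternative, actual)
combined = composite(usda, alternative, lam)
print(f"weight on alternative = {lam:.3f}")
print(f"RMSE USDA = {rmse(usda, actual):.3f}, composite = {rmse(combined, actual):.3f}")
```

Because the weight is estimated on the same sample, the in-sample composite error cannot exceed that of the preferred forecast; an out-of-sample evaluation would be needed to judge any real gain.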
Bora, Katchova, and Kuethe (Reference Bora, Katchova and Kuethe2023) applied a multi-horizon comparison approach developed by Quaedvlieg (Reference Quaedvlieg2021) to test the predictive content of the USDA and the Food and Agricultural Policy Research Institute (FAPRI) baseline projections. The authors found that neither USDA nor FAPRI projections had superior predictive ability over the other for most variables, except for farm-related income, where FAPRI performed better than USDA, and corn price and soybean yield, where USDA performed better than FAPRI. These findings suggest that collaboration between different agencies may be helpful to improve these projections.
Are USDA Forecasts Informative?
The informativeness of USDA forecasts is another aspect that has been explored in previous studies. The test of informational value developed by Baur and Orazem (Reference Baur and Orazem1994) examines whether USDA forecasts bring the market closer to equilibrium. This test is implemented by first estimating the “partial information” equation, where the final estimate is regressed against market expectations available a few days before the USDA report release, to obtain a measure of the forecast variance (the adjusted R-squared). Then, the “full information” equation is estimated by adding the USDA forecast to the partial information equation. Informational value, as measured by the reduction in forecast variance, is obtained as the difference in the adjusted R-squared values obtained from the partial and full information equations. Table 5 shows that in most cases, studies that implemented this test (Garcia et al., Reference Garcia, Irwin, Leuthold and Yang1997; Isengildina-Massa, Karali, and Irwin, Reference Isengildina-Massa, Karali and Irwin2020) found evidence of at least some informational value of USDA crop production forecasts.
“Yes” refers to the presence of informational value in the forecasts. The number reflects the number of horizons at which these forecasts are informative.
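The partial- and full-information regressions behind the test of informational value can be sketched as follows. The example assumes a single market-expectation series and a single USDA forecast series, and it obtains the full-information sum of squared errors by partialling the market expectation out of both the final estimate and the USDA forecast. All data and function names are hypothetical.

```python
# Hypothetical final estimates, pre-report market expectations, and USDA
# forecasts for the same events (made-up numbers).
final_estimate = [10.0, 11.2, 9.5, 12.1, 10.8, 13.0, 11.5, 12.4]
market_expectation = [10.4, 10.9, 9.9, 11.6, 11.1, 12.4, 11.9, 12.0]
usda_forecast = [10.1, 11.1, 9.6, 11.9, 10.9, 12.8, 11.6, 12.3]

def simple_residuals(y, x):
    """Residuals from an OLS regression of y on a constant and x."""
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    return [b - y_bar - slope * (a - x_bar) for a, b in zip(x, y)]

def adjusted_r2(sse, sst, n, k):
    """Adjusted R-squared for a regression with k slope coefficients."""
    return 1.0 - (sse / (n - k - 1)) / (sst / (n - 1))

def informational_value(final, market, usda):
    """Return (partial adj. R2, full adj. R2, difference)."""
    n = len(final)
    y_bar = sum(final) / n
    sst = sum((v - y_bar) ** 2 for v in final)
    # Partial information: final estimate on market expectations only.
    u_y = simple_residuals(final, market)
    sse_partial = sum(u * u for u in u_y)
    # Full information: add the USDA forecast. Regressing the partial
    # residuals on the part of the USDA forecast not explained by the
    # market gives the same SSE as the two-regressor model.
    u_x = simple_residuals(usda, market)
    cross = sum(a * b for a, b in zip(u_y, u_x))
    sse_full = sse_partial - cross ** 2 / sum(b * b for b in u_x)
    adj_partial = adjusted_r2(sse_partial, sst, n, 1)
    adj_full = adjusted_r2(sse_full, sst, n, 2)
    return adj_partial, adj_full, adj_full - adj_partial

adj_partial, adj_full, value = informational_value(
    final_estimate, market_expectation, usda_forecast
)
print(f"partial = {adj_partial:.3f}, full = {adj_full:.3f}, value = {value:.3f}")
```

A positive difference indicates that adding the USDA forecast reduces the unexplained variance of the final estimate beyond what market expectations already capture.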
Another test developed by Vuchelen and Gutierrez (Reference Vuchelen and Gutierrez2005) was used in some studies (e.g., Sanders and Manfredo, Reference Sanders and Manfredo2008) for the evaluation of information content across multiple horizons. This test establishes the contribution of longer-horizon forecasts relative to the information contained in shorter-horizon forecasts. For example, Sanders and Manfredo (Reference Sanders and Manfredo2008) found that turkey and milk production forecasts were informative for up to three periods (quarters) ahead, while egg production forecasts were not informative at even the current horizon.
Some recent studies (e.g., Bora, Katchova, and Kuethe, Reference Bora, Katchova and Kuethe2023; Luke and Tonsor, Reference Luke and Tonsor2023) used the test developed by Breitung and Knüppel (Reference Breitung and Knüppel2021) to determine the maximum informative projection horizon by comparing the projections’ mean-squared prediction errors to the variance of the evaluation sample. The benefit of this test is that it circumvents the need to compare projections to naïve benchmarks and instead compares prediction errors to the variance of realized values. The findings of Bora, Katchova, and Kuethe (Reference Bora, Katchova and Kuethe2023) demonstrate that the predictive content of the baseline projections for most variables diminished after four to five years, as shown in Table 5. On the other hand, Luke and Tonsor (Reference Luke and Tonsor2023) found that only pork export forecasts were informative for up to two years ahead, while most other import and export forecasts were uninformative at even the current horizon. These findings raise concerns about using these forecasts for decision-making.
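The comparison underlying the maximum informative horizon can be sketched descriptively: for each horizon, the mean-squared prediction error is divided by the variance of the realized values, and a ratio below one suggests the projection still carries information. The sketch below computes only this ratio and does not reproduce the formal test statistic or its inference. The data and names are hypothetical.

```python
# Hypothetical realized values and projections made 1, 2, and 3 periods ahead
# for the same target dates (made-up numbers).
realized = [100, 104, 98, 110, 95, 107, 101, 112]
projections = {
    1: [101, 103, 99, 108, 97, 106, 102, 110],
    2: [103, 101, 101, 106, 99, 104, 104, 107],
    3: [104, 102, 104, 103, 103, 102, 105, 104],
}

def mspe(forecasts, outcomes):
    """Mean-squared prediction error."""
    return sum((a - f) ** 2 for f, a in zip(forecasts, outcomes)) / len(outcomes)

def variance(values):
    """Variance of the evaluation sample (divisor n, to match the MSPE)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def max_informative_horizon(forecasts_by_horizon, outcomes):
    """Largest horizon up to which every MSPE-to-variance ratio is below one."""
    benchmark = variance(outcomes)
    max_h = 0
    for h in sorted(forecasts_by_horizon):
        if mspe(forecasts_by_horizon[h], outcomes) / benchmark >= 1.0:
            break
        max_h = h
    return max_h

for h in sorted(projections):
    ratio = mspe(projections[h], realized) / variance(realized)
    print(f"horizon {h}: MSPE / variance = {ratio:.2f}")
print("maximum informative horizon:", max_informative_horizon(projections, realized))
```

A result of zero would correspond to projections that are uninformative at even the shortest horizon.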
Performance of USDA Forecast Systems
As some USDA forecasts are released as part of a joint system (e.g., WASDE forecasts, farm income forecasts), it is important to consider not just the quality of their individual components but also the joint accuracy of the system as a whole, since the components interact with each other. Table 6 shows a summary of studies that addressed these issues. Most studies focused on the interaction of various components through evaluation of the residual variable (ending stocks for WASDE, NCI or NFI for farm income). For example, Botto et al. (Reference Botto, Isengildina, Irwin and Good2006) examined correlations between ending stocks and price forecast errors with the errors of the other balance sheet categories from WASDE reports for corn and soybeans and determined that errors in production and export (export and feed use) forecasts were the main drivers of errors in soybean (corn) price forecasts. On the other hand, almost all categories were significant in explaining the forecast errors in ending stocks.
MacDonald and Ash (Reference MacDonald and Ash2016) examined the sources of upward bias in U.S. soybean ending stocks forecasts and found that soybean export forecasts were the most likely source of this bias. Xiao, Hart, and Lence (Reference Xiao, Hart and Lence2017) found similar evidence and concluded that “Concerns, such as those voiced by the soybean industry, that the USDA ending stock estimates were not adequately capturing the export demand growth resulting in higher ending stock estimates and lower crop prices likely have some merit” (p. 239). For farm income forecasts, Isengildina-Massa et al. (Reference Isengildina-Massa, Karali, Kuethe and Katchova2019) looked for sources of errors in NCI forecasts released as part of the farm sector’s income statement. The authors found that errors in expenses and livestock and crop receipts were the largest contributors to NCI forecast errors.
While most of the previous studies used regression analysis to explore the sources of errors in residual categories, Goyal et al. (Reference Goyal, M.K., Glauber and Meyer2023) used machine learning tools to decompose USDA’s ending stocks forecast errors for corn, cotton, soybeans, and wheat. The authors demonstrated that export and production errors are the key contributors to ending stocks forecast errors, as shown in figure 6. The authors also linked USDA’s export errors to production and export levels in China, Mexico, Brazil, and the European Union. Their findings suggest that better information about production expectations, both domestically and worldwide, would lead to more efficient WASDE balance sheet forecasts.
Another innovation was focused on the joint evaluation of the forecasting system. For example, Isengildina-Massa et al. (Reference Isengildina-Massa, Karali, Kuethe and Katchova2021) used a test developed by Sinclair, Stekler, and Carnow (Reference Sinclair, Stekler and Carnow2015) which combines the single accuracy measure for each component of the joint forecasts into a vector. Specifically, it focuses on the difference (Mahalanobis distance) between the mean vectors of forecasts and outcomes while allowing for scale differences across different variables and a nonzero correlation between variables. The rationale behind this test is that if a vector of forecasts is similar to the vector of the outcomes, it can be substituted for the actual data for decision-making. They found that despite the observed biases and inefficiencies, USDA’s farm income forecasts were compositionally consistent with the actual outcomes and represent realistic projections of the farm sector accounts. Kuethe, Bora, and Katchova (Reference Kuethe, Bora and Katchova2021) used the same methodology to demonstrate that while the ERS farm income forecasts were accurate, baseline farm income forecasts were not.
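The distance measure at the core of this joint evaluation can be sketched for a two-component system. The example assumes the squared Mahalanobis distance is computed between the mean vector of forecasts and the mean vector of outcomes using a pooled sample covariance matrix; this is one common formulation, and the cited studies may differ in detail. The two components, the data, and the function names are hypothetical.

```python
# Hypothetical joint forecasts and outcomes for two components of a system,
# e.g., (crop receipts, expenses). The numbers are made up.
forecast_rows = [(200, 150), (210, 160), (190, 155), (220, 170), (205, 165)]
outcome_rows = [(205, 152), (208, 163), (195, 150), (225, 172), (200, 168)]

def mean_vector(rows):
    """Component-wise mean of a list of 2-tuples."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]

def covariance_2x2(rows):
    """Sample covariance matrix (divisor n - 1) for two variables."""
    n = len(rows)
    m = mean_vector(rows)
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        for i in range(2):
            for j in range(2):
                s[i][j] += (r[i] - m[i]) * (r[j] - m[j]) / (n - 1)
    return s

def mahalanobis_sq(forecasts, outcomes):
    """Squared Mahalanobis distance between the two mean vectors."""
    n1, n2 = len(forecasts), len(outcomes)
    sf = covariance_2x2(forecasts)
    sa = covariance_2x2(outcomes)
    pooled = [
        [((n1 - 1) * sf[i][j] + (n2 - 1) * sa[i][j]) / (n1 + n2 - 2)
         for j in range(2)]
        for i in range(2)
    ]
    det = pooled[0][0] * pooled[1][1] - pooled[0][1] * pooled[1][0]
    inv = [
        [pooled[1][1] / det, -pooled[0][1] / det],
        [-pooled[1][0] / det, pooled[0][0] / det],
    ]
    mf = mean_vector(forecasts)
    ma = mean_vector(outcomes)
    d = [mf[0] - ma[0], mf[1] - ma[1]]
    return sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))

d2 = mahalanobis_sq(forecast_rows, outcome_rows)
print(f"squared Mahalanobis distance = {d2:.4f}")
```

The pooled covariance matrix is what allows for scale differences and nonzero correlation between components; a small distance suggests that the forecast vector is close to the outcome vector as a whole, even if individual components show errors.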
Evaluation of Uncertainty in USDA Forecasts
Finally, it is important to note that all of the above methods have been developed and applied to point forecasts, which is how most of the USDA projections are published. Only price forecasts have been traditionally published as intervals. Most of the earlier studies, with some exceptions, reduced these ranges to their midpoint for evaluation. Sanders and Manfredo (Reference Sanders and Manfredo2003) were the first to assess the probability of the final price falling within the forecast interval range. Isengildina, Irwin, and Good (Reference Isengildina, Irwin and Good2004) expanded on this issue to implement a formal evaluation of USDA corn and soybean price forecasts as intervals based on the framework of Christoffersen (Reference Christoffersen1998). They found that corn price forecast intervals contained the final value about 50% of the time during the pre-harvest period and about 77% after harvest. Soybean price forecasts contained the final value about 73% of the time pre-harvest and about 80% after harvest. While these forecasts were not calibrated at the conventional 95% confidence levels, soybean price forecasts were calibrated at levels implied by forecast providers. Isengildina-Massa and Sharp (Reference Isengildina-Massa and Sharp2012) argued that while these price ranges were constructed symmetrically, the distribution of forecast prices is asymmetric. When asymmetry was considered, calibration results changed for several interval forecasts of corn, soybean, and wheat prices, suggesting that these forecasts were asymmetric but accurate. Independence tests revealed that USDA published similar ranges in both volatile and tranquil times, indicating that these ranges did not adequately reflect uncertainty in the forecasts. Despite the value of forecast intervals in communicating uncertainty, USDA chose to eliminate these ranges in favor of point estimates in 2019. 
Instead, forecast users are provided with reliability statistics for some of the estimates, listed at the end of each WASDE report, which include RMSE and 90% confidence intervals for historical estimates, among other metrics.
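The calibration check applied to interval forecasts can be sketched as a hit rate combined with a likelihood-ratio test of unconditional coverage in the spirit of the Christoffersen framework. The sketch covers only unconditional coverage and omits the independence test. The intervals, final prices, and function names are hypothetical, and the p-value relies on the chi-squared distribution with one degree of freedom.

```python
from math import erfc, log, sqrt

# Hypothetical published price ranges and final prices (made-up numbers).
intervals = [(3.0, 3.6), (3.2, 3.8), (3.5, 4.1), (4.0, 4.6), (3.8, 4.4),
             (3.4, 4.0), (3.1, 3.7), (3.6, 4.2), (4.2, 4.8), (3.9, 4.5)]
final_prices = [3.4, 3.9, 3.8, 4.7, 4.0, 3.5, 3.3, 4.3, 4.5, 4.1]

def coverage_hits(ranges, finals):
    """1 if the final value falls inside the forecast interval, else 0."""
    return [1 if low <= x <= high else 0 for (low, high), x in zip(ranges, finals)]

def _xlogy(x, y):
    """x * log(y), with the convention that 0 * log(0) equals 0."""
    return 0.0 if x == 0 else x * log(y)

def lr_unconditional_coverage(hits, nominal):
    """Return (hit rate, LR statistic, p-value) for H0: coverage == nominal."""
    n = len(hits)
    n1 = sum(hits)
    n0 = n - n1
    hit_rate = n1 / n
    ll_null = _xlogy(n1, nominal) + _xlogy(n0, 1 - nominal)
    ll_alt = _xlogy(n1, hit_rate) + _xlogy(n0, 1 - hit_rate)
    lr = max(-2.0 * (ll_null - ll_alt), 0.0)
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = erfc(sqrt(lr / 2.0))
    return hit_rate, lr, p_value

hits = coverage_hits(intervals, final_prices)
for nominal in (0.95, 0.70):
    rate, lr, p = lr_unconditional_coverage(hits, nominal)
    print(f"nominal {nominal:.2f}: hit rate {rate:.2f}, LR {lr:.2f}, p {p:.3f}")
```

Running the test at several nominal levels mirrors the distinction drawn above between calibration at a conventional confidence level and calibration at the level implied by the forecast providers.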
Conclusions and Implications
Because of their importance and value to market participants, the accuracy of USDA forecasts has received substantial attention in the academic literature. This study reviewed almost 40 years of research regarding USDA forecasts’ optimality. We found extensive evidence of the strengths of USDA forecasts relative to various benchmarks in terms of accuracy, efficiency, encompassing, and informativeness. However, areas for improvement were also highlighted in the previous studies, suggesting that these forecasts are not optimal across all measures. It is important to note that the ultimate goal in this strand of research is to ensure the highest quality of these forecasts. Therefore, when shortcomings in forecast optimality were found, they led to follow-up studies on how these issues could be overcome.
It is difficult to combine the findings of the previous studies due to heterogeneity in methodologies, forecasts, commodities, and sample periods. In the most general terms, figure 7 shows that evidence of bias was most common for forecasts related to corn markets (63% of analyses in Table 1)Footnote 5 and least common in soybean forecasts (33% of analyses). Figure 8 shows the average percent errors for USDA reports across the subset of studies that reported these measures.Footnote 6 This figure illustrates that the bias in corn forecasts was split almost equally between over- and underestimation, with the magnitude of overestimation (negative errors) appearing larger for early yield, exports, and ending stocks forecasts. For soybeans, any evidence of bias is overshadowed by a large overestimation in August ending stocks forecasts. In wheat, most of the evidence of bias is associated with underestimation. This raises the question of whether observed underestimation was caused by the inability of analysts to accurately predict growth rates or by their “aversion” to overprediction. That is, do public forecasters minimize a symmetric or asymmetric loss function? Do behavioral reasons or data limitations lead to sub-optimal outcomes, or are these outcomes driven by forecasters’ goals? While most studies assume that the forecasters’ loss function is symmetric, some recent studies (Bora, Katchova, and Kuethe, Reference Bora, Katchova and Kuethe2020) relax this assumption. Conditions and implications of asymmetric loss functions for public forecasters are not well accepted or understood and should be explored further.
Studies of forecast accuracy provide valuable information on the reliability of various forecasts and demonstrate that short-term forecasts tend to have much smaller errors than longer-term forecasts. While most studies conduct accuracy analysis by forecast horizon, some recent studies propose multi-horizon accuracy tests (Patton and Timmermann, Reference Patton and Timmermann2012; Quaedvlieg, Reference Quaedvlieg2021). Accuracy tests can be used to assess the accuracy of USDA forecasts relative to a benchmark of naïve or time-series forecasts. Figure 9 shows that 94% of the studies in Table 2 that investigated the accuracy of USDA corn forecasts relative to a benchmark found USDA forecasts to be more accurate or the same. Similarly, 85% (83%) of the studies found USDA’s soybean (wheat) forecasts more accurate or the same. Figure 10 shows average Mean Absolute Percentage Errors for USDA reports across the subsample of studies that reported this metric.Footnote 7 This figure illustrates that the size of forecast errors appears similar across commodities, that the magnitude of forecast errors declines over the forecasting cycle as the horizon shortens, and that baseline forecast errors tend to be much larger than shorter-term marketing year forecast errors.
Finding a time-series or market-based alternative to USDA forecasts has long been a goal of researchers, as this would help reduce the costs of public outlook programs. However, previous studies show that USDA forecasts tend to be more accurate than naïve and time-series alternatives and have similar accuracy to market (futures)-based forecasts. Developing models and methods that can improve the accuracy of USDA forecasts is an important area for future research. Specifically, incorporating new data sources that became available due to modern technological advances, as well as applying new data analysis techniques to further improve crop production and other forecasts, offers extensive opportunities for further research. On the other hand, given the similar accuracy of market-based price forecasts, USDA should consider adopting them for commodities with active futures markets and redirecting the resources saved to other uses, rather than investing in alternative forecasting efforts.
Sources of information used in the forecasts also make a big difference. One of the main advantages of USDA is its access to data from large-scale surveys conducted by NASS. Reports that rely on survey data are sometimes referred to as situation reports, while forecasts that use non-survey data are described as outlook reports. The surveys can generate accuracy levels that have not been matched so far by alternative data collection methods such as remote sensing tools. Previous studies show that corn, soybean, and wheat production forecasts from USDA are more accurate than those from their private counterparts. Maintaining this informational advantage depends on maintaining high response rates to these surveys, a goal that has been reiterated in numerous studies and taken very seriously by the USDA and various producer groups.
Evaluation of sampling techniques and survey needs to support the quality of USDA information is another interesting area for future research. For example, in a series of studies by Irwin et al. (Reference Irwin, Sanders and Good2014b, Reference Irwin, Sanders and Good2014c, Reference Irwin, Sanders and Good2014d, Reference Irwin, Sanders and Good2014e), USDA’s grain stocks forecasts were compared to industry expectations to point out that there was a notable decline in the ability of market participants to anticipate USDA stock estimates for corn since 2013. The authors argued that the larger surprises in corn grain stocks estimates are not due to commonly proposed reasons, but rather to unresolved sampling errors in production estimates. They demonstrated that USDA’s grain stocks estimates undoubtedly reflected sampling errors for both production and stocks estimates and that it is highly likely that unresolved sampling errors for corn production estimates are large enough to explain even the largest surprises. Their analysis highlighted the potential value of adding a survey of corn feed use that would allow a fuller accounting of corn usage as well as a revision of January corn production estimates similar to what has historically been done for soybeans (Irwin, Sanders, and Good, Reference Irwin, Sanders and Good2014e).
Forecast optimality also includes a concept of rationality, or the ability to fully and efficiently include all available information. Previous studies revealed several cases where rationality was rejected because USDA appeared to repeat past errors or failed to fully incorporate macroeconomic information. By far the most common form of inefficiency observed in USDA reports is the correlation in forecast revisions. The findings of these studies should be used by USDA to improve its forecasting procedures and may also be used by forecast users to better understand USDA forecasts’ limitations. However, the interpretation of these findings is not always straightforward. For example, Goyal and Adjemian (Reference Goyal and Adjemian2023) argued that correlated revisions may be explained by information rigidities rather than strategic smoothing on the part of forecast providers. As forecast inefficiencies and the ways to correct them offer opportunities to improve forecast accuracy, they will continue to represent an important area of research.
In addition to traditional metrics of bias, accuracy, and efficiency, additional aspects related to informativeness, uncertainty, and joint systems emerged in the literature. These aspects help us evaluate and interpret USDA forecasts from different angles. These approaches provide useful information for decision-making and help answer the following questions. Up to what horizon are the forecasts informative and when do they lose their value? Do these forecasts reduce market forecast variance and help us get closer to the equilibrium? How uncertain are these forecasts at different horizons? Do errors in some forecasts cause errors elsewhere in the forecast system? Is the forecast system consistent enough with the system of outcomes to be used for decision-making? All these additional questions enrich our understanding of USDA forecasts and should be explored further. Forecasting is an important task in many fields, and researchers can use lessons learned from other fields to improve agricultural forecasting.
Overall, our study revealed that even though there is substantial evidence that USDA information is accurate and reliable, there is also room for improvement. The emergence of new methods recently developed for the evaluation of forecast accuracy and efficiency offers significant opportunities for gaining new insights. Better access to data and technology allows for continuous updating and improvement of forecasting approaches to provide the best available information to the public. In this environment, USDA needs to be open and transparent with its data and methods. For example, access to the archive of historical forecasts and estimates allowed for recent research on farm income and baseline forecasts. With better access to data and methods, researchers will be able to offer more insights into forecasts’ characteristics, their strengths and weaknesses, and potential areas for improvement.