1 Introduction
Home energy audits have been offered in the United States since the 1970s, and their use has expanded with the availability of stimulus funds in recent years (Ingle et al., 2012). In California, home energy efficiency survey (HEES) programs are implemented statewide by public utilities. The programs’ objectives are to increase awareness, inform customers about their consumption behavior, and make other resources available to reduce energy consumption. When customers complete a survey questionnaire, they receive extensive personalized feedback and tips about what actions they can take to save energy and money. The surveys inform both the implementer and the consumer about how energy has been used in a house. Because there is imperfect information regarding a household’s inattention and usage behavior, personalized feedback can lead to the desired behavioral change. We present evidence that customers who participated in the survey reduced their electricity consumption by about 7% on average compared to customers who had not yet participated, and that the effect of the program decreases as the quantiles of the outcome distribution increase.
As discussed in earlier studies, consumers may behave inefficiently because of the unclear relationship between price and electricity use. Home energy efficiency audits can close the information gap via personalized feedback that serves as a reminder. By providing additional tailored information, personalized feedback may also decrease information asymmetry and result in more efficient and persistent behavior change by lowering the cognitive cost of energy decision-making (Gillingham et al., 2009). Numerous conservation studies have been designed using varying informational and behavioral strategies to address the information gap (Abrahamse et al., 2005; Delmas et al., 2013). Utility companies have been among the major implementers of home energy audit programs. Under regulatory practices utilities have an incentive to invest in conservation measures, but they may limit actual conservation through the improper design of a program (Wirl & Orasch, 1998). So, although there have been some successes, the ways that home energy audits have been designed and executed have often been ineffective.
In California, utilities also have not used experimental research methods (RCT designs) to build and implement HEES programs. Thus, this and similar types of self-selected participant studies did not lead to ground-breaking policy changes or the behavioral interventions needed to change consumer behavior. Recently, there have been some signs of the implementation of scientific approaches in energy efficiency program designs. In California, the Public Utilities Commission (CPUC) made it mandatory for all statewide Investor-Owned Utilities (IOUs) to implement behavior-based programs. Examples of such behavior-based programs are the social comparisons implemented by the research company OPOWER (Allcott, 2011) and the multi-family complex competition run by the Southern California Edison (SCE) Company (Chen et al., 2015).
Much of the empirical microeconomic literature in the economic development area uses econometric and statistical methods to overcome the deficiencies of non-experimental data (Deaton, 2000). Because of the inherent self-selection in the survey we study, we begin by employing the empirical technique of Sianesi (2004), who examines the effectiveness of unemployment programs in Sweden. She suggests selecting future program participants as the comparison group for matching estimations. We apply the method in a different market setting, residential energy efficiency audits. The difference-in-differences (DID) estimator provides evidence that participation in the survey leads to about 7% less electricity consumption by survey participants on average compared to customers who did not participate. In addition, the effect is persistent over time, at least for the year after the survey.
Our objective here is to propose an alternative method for evaluating HEES programs by selecting future participants as the control group, as suggested by Sianesi (2004). The approach differs from the current practice of evaluating HEES and similar programs. For instance, the 2006–2008 HEES impact evaluation was based on participant information only, whereas the 2010–2012 and 2015 studies matched non-participants to construct the control group for estimating treatment effects.
Utilities use various delivery mechanisms to implement home energy audit programs – through mail, online, telephone and in-home (on-site) audits. Here we also investigate the differential performance of the mail-in versus online versions of the home energy surveys in addition to the combined survey impact.
Finally, there is recent concern as to whether nudge-based and other household energy conservation programs are regressive and whether onerous requirements are imposed on less well-off households (Gayer & Viscusi, 2013; Levinson, 2016). We employ quantile regression techniques to detect the distributional usage effects of the home energy surveys. Households in the lowest quantiles have a more substantial response to non-binding energy conservation efforts, in percentage reduction of electricity consumption, than consumers in the median or highest quantiles. The importance of our quantile analysis is in showing that the estimated survey effects differ by the level of pre-survey household consumption.
2 Data
An IOU in California provided the data to us on a confidential basis. The information covers more than 4200 customers who voluntarily participated in the HEES in January of 2009 and 2010. We eliminated households with fewer than 12 months of consumption data during the period, leaving a total of $N=4173$ households.
Because households opt in to the HEES program, we first chose the January 2009 survey participants as the treatment group and the future survey participants, those from January 2010, as the comparison group. The comparison group contains customers who did not participate in January 2009 and had not yet participated in the survey (Sianesi, 2004). We use the same monthly usage and billing interval, 2008 and 2009, for the treatment and comparison groups. The summary statistics for our data appear in Table 1. The data set we use here is the result of combining several sources that reflect monthly energy consumption: billing records, dwelling demographics, temperature, and the survey (HEES). The billing data cover 2008 and 2009 for both the 2009 and 2010 survey participants. The weather information comes from the monthly Cooling Degree-Days (CDD) data over the billing period from 2008 to 2009, which we merged with the main dataset. Because California has warmer weather than the national average, we used a $72\,^{\circ }\text{F}$ indoor baseline temperature instead of the nationally defined baseline of $65\,^{\circ }\text{F}$ .
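As an illustration of the degree-day construction (a sketch of the standard definition; the utility’s exact billing-period alignment is not shown), monthly CDD with the 72 °F base can be computed from daily mean temperatures:

```python
import numpy as np

def monthly_cdd(daily_mean_temps_f, base_temp_f=72.0):
    """Cooling Degree-Days for one billing month: the sum of positive
    deviations of each day's mean temperature above the base temperature."""
    temps = np.asarray(daily_mean_temps_f, dtype=float)
    return float(np.sum(np.maximum(temps - base_temp_f, 0.0)))

# With the 72 F base, only days warmer than 72 F contribute:
# monthly_cdd([70, 75, 80]) -> (75 - 72) + (80 - 72) = 11.0
```

Raising the base from 65 °F to 72 °F mechanically lowers the CDD totals, which is why the choice of baseline matters when merging weather with billing data.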
Note: Standard deviations are in parentheses. Percentages are rounded. 97.5% of the households in the data have 24 months of observations; the remaining 2.5% have between 15 and 23 months.
The HEES program provides residential customers with an energy audit of their homes through a mail-in, online, telephone, or in-home (on-site) energy survey. The survey instrument asks the participants a series of questions about their homes and then offers a list of tips based on their responses. Subsequent recommendations include both possible changes in behavior and information on more energy-efficient appliances. The program is meant to incite action; its purpose is to inform the participants of opportunities to save money and to provide resources to implement the recommendations.
It is important to determine whether the design of the HEES report successfully imparts useful knowledge, refers participants to helpful resources, and whether the coordination effort motivates participants to adopt more energy- and water-efficient behaviors. As noted earlier, we focus on mail-in and online survey participant data. The two survey methods are commonly compared with other methods. Furthermore, telephone and in-home surveys are used less frequently by utilities and have not been customers’ preferred modes of participation. In-home data are also costly for utilities to collect, although the largest savings are observed as a result of in-home survey-based interventions (ECONorthwest, 2009; Itron Inc., 2013).
The literature presents evidence of low take-up rates for opt-in energy efficiency programs and home energy reports (HERs) (Fowlie et al., 2015; Allcott & Kessler, 2018). Throughout California, IOUs have used various targeting methods to get customers to complete the energy efficiency surveys. The marketing process often targeted households with high bills, which were therefore likely to achieve higher savings; this is particularly true for mail-in participants in our sample (Itron, 2013). To encourage households to complete the surveys, IOUs provided incentives, such as gift cards, to participate in the survey program (Itron, 2013). Online surveys were also marketed with email blasts and through utility websites, but they remained available to all households through the IOU website.
Figure 1 describes the differences among the treatment, comparison, and randomly selected households in terms of their monthly average electricity usage. It highlights the mean energy usage differences (kWh) by income group during the pre-survey period. The randomly selected households, who never participated in the HEES program, total about 10,000 residential customers from the same utility company. We show means for each income group by treatment versus comparison sub-groups. Randomly selected non-participant households consumed substantially less energy than households in both the treatment group and the comparison group, who participated in the survey in the following year (2010). Overall, HEES participants who opted into the program had a higher average usage than non-participants.
3 Methods
Because the audit program uses online-based and mail-delivery mechanisms (formats) to reach customers, we first evaluate the average impact of each format separately on post-audit energy consumption behavior. Here the treatment group is the January 2009 program participants, and the comparison group is the January 2010 program participants. To address the self-selection issue we first identify a valid comparison group. We chose the January 2010 program participants (future survey participants) as the comparison group so that the classical treatment and control distinction holds (Sianesi, 2004). Our framework then specifies a proper and valid matching estimator. The approach we use is more reliable (Sianesi, 2004, 2008) than matching against persons who have never participated in home energy audits (Du et al., 2014; Itron, 2013). The HEES program evaluation study prepared by Itron, Inc. (2013) also presents the impact of the survey by employing a matching method in which the comparison group consists of non-participants.
Another common practice in evaluating home energy audits has been engineering-based ex ante analysis, which has led to systematically biased and exaggerated energy savings estimates and significant overestimates of persistent energy saving (Nadel & Keating, 1991; Dubin et al., 1986; Davis et al., 2014; Gerarden et al., 2017). In particular, “There may have been a selection bias whereby researchers have chosen to evaluate engineering-economic analysis that have most exaggerated the savings potential of efficiency investments” (Gerarden et al., 2017).
3.1 Addressing self-selection in opt-in programs
Randomized experiments create independence between the treatment application and consumer characteristics, both observed and unobserved. Non-randomized observational data can be misleading because of self-selection – decisions made here by households to participate in the energy efficiency survey. The main concerns are unmeasured factors, such as motivation to take action, which may affect the decision to participate in the survey along with post-intervention behavior. A customer who has requested an audit may be from the type of household taking other unobserved actions to conserve energy (Allcott & Mullainathan, 2010).
The confounding difference between survey participants and non-participants underscores the difficulty of controlling for interpersonal differences when estimating the causal effects of programs. The main problem here is that often the researcher wishes to draw conclusions about the wider population, not just the sub-population from which the data come (Kennedy, 2003). However, because of ethical problems, the large costs of implementing randomizations, and problems with external validity, many studies use observational data instead of implementing a randomized experiment (Fu et al., 2007; Black, 1996).
Like many other energy efficiency survey programs, the HEES audit program we evaluate here is one in which the customer chooses to participate in the survey rather than being randomly assigned by the program designer. Because people self-select into the program, it is difficult to identify what the response would be if the program were implemented on a mandatory basis or through some added participation incentive payment. However, if the research question is simply how voluntary participants in the programs respond, then there is no confounding self-selection issue. Although households opt in to the program, IOUs targeted high-energy users through mailings, post-cards, the IOU website, email blasts, incentives, and various other means to induce high-usage households to join and complete the survey. So the high-energy users in our sample were particularly targeted and tagged to be part of the program. This makes the HEES program similar to the Weatherization Assistance Program (WAP) that Fowlie et al. (2015) studied, where the program provides free energy efficiency improvements to low-income households. The design of the HEES audit program also differs from the solar photovoltaic (PV) programs in California, where the rate structure and cost of PV installation, regardless of the tax incentives and rebates, have tilted more affluent households toward participating (Borenstein, 2017).
Because we want our empirical results to be informative on the issue of mandatory implementation, we also consider econometric solutions to the problem of self-selected data. To provide a proper estimate of the treatment effect with observational data, we employ the method suggested by Sianesi (2004), where the comparison group consists of customers who were not yet participating in the survey but participated later. The samples in both the treatment and comparison groups received the same type of encouragement or targeted marketing, but at different times. Our empirical approach could reduce possibly inflated program-effect estimates and provides a credible way to assess the underlying causal hypothesis. Because an experimental approach was not feasible for the type of survey used by the institution, we propose a credible empirical method and comparison group. We are not only using a selection-on-observables approach; by using future participants as the comparison group, we also address the issue of unobservable characteristics. The method thus both addresses possible self-selection problems in evaluating energy-saving programs and credibly measures their effectiveness in targeted sample settings. Instead of using randomly selected utility customers as a comparison group and matching them with the treatment group based on observable pre-survey characteristics, we use customers who joined the program later, in January 2010.
3.2 Evaluation approach
Using the mean outcome of untreated individuals $E[Y_{0}|T=0]$ in non-experimental studies is usually not a good idea because components that determine the treatment decision may also determine the outcome variable of interest (Caliendo & Kopeinig, 2005). This suggests that even if the researcher chooses the best possible candidate for the comparison group, its consumption levels may still differ from what the participants’ consumption would have been absent the survey, because the counterfactual is unobserved. We therefore begin by estimating the effect without matching for comparison with the other models. Then we estimate the program effect using the matching methods. To validate the matching procedure for empirical content and external validity, it is important that the following conditions hold: the conditional independence assumption (CIA) and common support (CS). The CIA requires that, given a set of observable characteristics, the distribution of $Y_{t}^{0}$ for customers who participate in the survey in January 2009 is the same as the (observed) distribution of $Y_{t}^{0}$ for customers who wait until January 2010 to participate (Sianesi, 2004):
$$Y_{t}^{0}\perp T\mid X\quad \text{for all }t.\qquad (1)$$
Because we chose a comparison group from the future participants, equation (1) postulates that conditional on $X$ , there is no unobservable heterogeneity left that affects both survey participation and later consumption (Sianesi, Reference Sianesi2004, Reference Sianesi2008; Caliendo & Kopeinig, Reference Caliendo and Kopeinig2005), which suggests that the probability distributions of the two groups are similar to each other.Footnote 14
Another requirement for the matching procedure is the CS or overlap condition:
$$0<P(T=1\mid X)<1.\qquad (2)$$
“This condition guarantees that persons with the same $X$ values have a positive probability of being both participants and non-participants” (Heckman et al., 1999). The CS condition means that for every customer in the treatment group there are customers with similar characteristics in the comparison group. Heckman et al. (1999) show that the CS condition is central to the validity of matching. Considering the conditional independence and CS conditions, the literature suggests that the propensity score is useful in constructing matching estimators. The propensity score is the conditional probability of being treated at time $t$ given a vector of observed characteristics, which reduces the dimensionality of the matching problem (Rosenbaum & Rubin, 1983). The propensity score here estimates the propensity of the customers with a set of observed characteristics to receive the program – the energy efficiency survey. Thus, the customers who have the same or similar propensity-score values have similar distributions of all of the observable characteristics.
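A minimal sketch of the propensity-score step, assuming a logit model fit by Newton–Raphson; the covariate matrix here is a placeholder for the pre-treatment household characteristics described above, not the paper’s actual specification:

```python
import numpy as np

def propensity_scores(X, treated, n_iter=25):
    """Estimate P(T=1 | X) with a logistic regression fit by
    Newton-Raphson. X: (n, k) covariates; treated: (n,) 0/1 indicator."""
    Xd = np.column_stack([np.ones(len(X)), X])  # add an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))          # fitted probabilities
        grad = Xd.T @ (treated - p)                   # score vector
        hess = (Xd * (p * (1.0 - p))[:, None]).T @ Xd  # information matrix
        beta += np.linalg.solve(hess + 1e-8 * np.eye(len(beta)), grad)
    return 1.0 / (1.0 + np.exp(-Xd @ beta))
```

Customers with similar fitted scores then have similar distributions of the observed covariates, which is what the balance diagnostics in Section 4.1 check.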
Figure 2 shows that the customers in the treatment and comparison groups have similar propensity-score distributions. According to Dehejia and Wahba (2002), propensity-score-matching estimates are more consistent with estimates that are derived from an experimental design. However, propensity-score matching does not guarantee that all of the individuals in the non-treatment group will be matched with individuals in the treatment group (Titus, 2007).
Once estimated, the propensity score can be used in a variety of analytic approaches, such as matching and weighting. The literature identifies several ways of matching each survey participant to a non-participant (Rosenbaum & Rubin, 1983, 1985; Rubin & Thomas, 1992; Baser, 2006; Hansen, 2004; Smith, 1997). We use kernel propensity-score-matching methods to calculate the difference-in-differences estimator. Kernel matching is a non-parametric estimator that uses weighted averages of all persons in the comparison group to construct the counterfactual outcome (Caliendo & Kopeinig, 2005). The kernel-based weight declines with the distance between the individuals in the two groups. No single matching estimator is appropriate by itself. We performed kernel-based propensity-score matching because of the large sample size and feasibility.
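To make the kernel weighting concrete, here is a sketch of how a matched counterfactual can be built with a Gaussian kernel on propensity-score distances; the bandwidth value is illustrative, not the one used in the estimation:

```python
import numpy as np

def kernel_matched_counterfactual(ps_treated, ps_control, y_control,
                                  bandwidth=0.06):
    """For each treated unit, a weighted average of comparison-group
    outcomes, with Gaussian kernel weights that decline in the
    propensity-score distance (illustrative bandwidth)."""
    d = (ps_treated[:, None] - ps_control[None, :]) / bandwidth
    w = np.exp(-0.5 * d ** 2)          # kernel weight for every pair
    w /= w.sum(axis=1, keepdims=True)  # normalize weights per treated unit
    return w @ y_control               # counterfactual outcome per treated unit
```

Comparison-group households with scores close to a treated household receive most of the weight, so the counterfactual is local rather than a raw group mean.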
We then introduce non-parametric versions of the difference-in-differences (DID) estimation with the later participants as a comparison group using the kernel-based propensity-score-matching method (Meyer, 1995; Heckman et al., 1998; Sianesi, 2004, 2008; Allcott, 2011). Allcott (2011) suggests forming a comparison group according to the average monthly energy use of households. The benefit of the standard DID model is that it provides the average effect of the intervention on the treatment. Furthermore, because of the self-selection in the sample, we adopt a difference-in-differences matching estimator to control for the presence of the unobservable characteristics, as referenced in List et al. (2003). Finally, Heckman et al. (1998) and Blundell and Costa Dias (2009) note that propensity-score DID accounts for both observed and unobserved time-invariant differences between the treatment and the comparison groups, which mitigates bias.
The design of our DID model is as follows. Individual $i$ belongs to either the treatment or the comparison group, $T_{i}\in \{0,1\}$ , where $T=1$ is the treatment group and is observed in $t$ periods, where $t$ indexes periods 1 and 2. The period of $i$ ’s consumption behavior is defined as $P_{t}\in \{0,1\}$ , the before and after treatment periods. $Y_{i}$ is the outcome variable – monthly energy consumption in ln(kWh) and in kWh. The interaction term $T_{i}\cdot P_{t}$ is an indicator of the treatment. The standard DID model for the realized outcome is then
$$Y_{it}=\beta _{0}+\beta _{1}T_{i}+\beta _{2}P_{t}+\psi (T_{i}\cdot P_{t})+X_{i}^{\prime }\delta +\varepsilon _{it}.\qquad (3)$$
The coefficient of the interaction term, $\psi$ , is the DID effect, or the impact of survey participation on later consumption behavior. $X$ is a vector of household demographics, dwelling characteristics and responses to the survey questionnaire. The DID is the difference in the average outcome in the treatment group before and after the treatment minus the difference in the average outcome in the comparison group before and after the treatment (Athey & Imbens, 2006). The following equation shows the standard DID estimand, $\psi$ .
$$\psi =\{E[Y\mid T=1,P=1]-E[Y\mid T=1,P=0]\}-\{E[Y\mid T=0,P=1]-E[Y\mid T=0,P=0]\}.\qquad (4)$$
Smith and Todd (2001), who examine whether social programs can be reliably evaluated without using randomized experiments, conclude that DID matching estimators generally exhibit better overall performance. Because our study has access to pre- and post-treatment residential energy consumption data, the DID with propensity-score-matching approach is well suited to our research.
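The two-by-two logic of the estimand can be checked with a minimal regression sketch; the data here are illustrative, not the utility sample:

```python
import numpy as np

def did_estimate(y, treat, post):
    """OLS of y on a constant, the treatment dummy, the post-period
    dummy, and their interaction; returns the interaction coefficient,
    i.e. the DID effect psi."""
    X = np.column_stack([np.ones_like(y), treat, post, treat * post])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[3]

# Four cell means: comparison 10 -> 12, treatment 11 -> 12.3,
# so the DID effect is (12.3 - 11) - (12 - 10) = -0.7.
treat = np.array([0.0, 0.0, 1.0, 1.0])
post = np.array([0.0, 1.0, 0.0, 1.0])
y = np.array([10.0, 12.0, 11.0, 12.3])
```

With log consumption as the outcome, this interaction coefficient is read as an approximate proportionate change.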
Another type of non-parametric approach that we apply is the quantile DID (QDID) matching method. We continue using kernel-based propensity-score matching. The focus of the basic DID method is to produce the average causal effects of program participation. However, we are also interested in investigating the effect of the programs on the entire distribution of outcomes. “The distribution of the dependent variable may change in many ways that are not revealed or are only incompletely revealed by an examination of averages” (Frölich & Melly, 2010). Because our dependent variable, monthly energy consumption, is continuous, it makes sense to test the effect on the distribution by identifying the relative savers and losers (Angrist & Pischke, 2009). The primary observable source of heterogeneity is pre-treatment usage (Allcott, 2011). It is possible that households in the lower quantiles respond to the survey differently than households in the upper quantiles. Quantile regression reduces the importance of outliers and functional-form assumptions and allows us to examine features of the distribution besides the mean (Meyer et al., 1995).
Here the survey may have different effects in different quantiles, so we apply DID to each quantile rather than to the mean to investigate features of the distribution (Meyer et al., 1995; Athey & Imbens, 2006). The QDID estimates we present are for both the extreme (0.1 and 0.9) and central (0.25, 0.5, 0.75) quantiles. The QDID estimator on quantile $q$ can be written as
$$\psi _{q}=\{F_{Y\mid T=1,P=1}^{-1}(q|X)-F_{Y\mid T=1,P=0}^{-1}(q|X)\}-\{F_{Y\mid T=0,P=1}^{-1}(q|X)-F_{Y\mid T=0,P=0}^{-1}(q|X)\},\qquad (5)$$
where $F_{Y}^{-1}(q|X)$ is the inverse distribution (quantile) function for $Y$ at $q$ , conditional on $X$ (the matched observable characteristics or propensity scores) (Athey & Imbens, 2006). Equation (5) shows the difference between the treatment and comparison groups before and after the treatment for different quantiles. To our knowledge, our study is one of the earliest attempts to apply the QDID matching method to residential energy efficiency program evaluation.
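A sketch of the unconditional version of this estimator, treating the two groups as already matched (the paper’s version conditions on the matched propensity scores):

```python
import numpy as np

def quantile_did(y, treat, post, q=0.5):
    """Quantile DID at quantile q: the before/after change in the q-th
    quantile of y for the treatment group minus the same change for
    the comparison group (unconditional, for illustration)."""
    def Q(mask):
        return np.quantile(y[mask], q)
    return ((Q((treat == 1) & (post == 1)) - Q((treat == 1) & (post == 0)))
            - (Q((treat == 0) & (post == 1)) - Q((treat == 0) & (post == 0))))
```

Evaluating this at q = 0.1, 0.25, 0.5, 0.75, 0.9 traces out how the survey effect varies across the usage distribution.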
We use the natural logarithmic transformation, Ln(kWh), where the interpretation of the effect is in terms of proportionate changes. We show changes in kWh usage as well. Finally, to identify the durability of the intervention we estimate both short-term (quarterly) and longer-term (one-year) effects of energy efficiency survey participation.
4 Results
Participation in the energy audit program is voluntary. If non-participants were used as a comparison group, systematic energy use differences would be apparent between the participant and non-participant groups because of unobservable motivation and observable household characteristics. Indeed, the Itron (2013) report for the CPUC employed propensity-score matching with non-participants and matched on too many observable characteristics. As a result, almost 90% of the sample in the comparison group was dropped during the matching process. We instead begin by identifying and justifying a valid comparison group and then continue with the regression estimation. The objective is to prevent an inflated estimate of the audit program’s potential impact. The interest in calculating the propensity score and matching methods “purely lies in their combined ability to balance the characteristics of the matched sub-groups being pair-wisely compared” (Sianesi, 2008).
We estimate the outcome of interest, post-audit behavior, by employing two non-parametric estimation techniques. We begin with kernel propensity-score-matching DID, which produces average treatment effects. We also investigate the impact of an audit on the entire distribution by employing a QDID approach. Although we focus on overall survey participation, we also report the results separately for web-based and mail-in program participants and the impact on consumption over time. Our results suggest that there is a significant reduction in consumption overall with audit participation. Web-based survey participants show much greater reactions to their surveys than mail-in participants do (11% vs. 4%). To test the durability of the intervention, we also estimate short-term (quarterly) and longer-term (one-year) effects. Because we use DID and QDID, seasonality should not be a concern. However, as an additional robustness check, we calculated the estimators in both scenarios – seasonally adjusted and unadjusted regressions – and there is only a marginal difference between the two. The details of the additional analyses and discussions appear in the following sub-sections.
4.1 Graphical results and balance diagnostics
Figure 1 shows the mean energy usage (kWh) by income group during the pre-survey period. The matching procedure was effective in creating a group of customers comparable with the treatment group based on observable confounders. First, we estimate the probability of participating in the survey given the values of potential confounders (the propensity score) for each customer in the data. Next, we graphically display the distributions of propensity scores for the treatment and control groups (Figures 2a and 2b) for all cases – the overall, online, and mail-based delivery mechanisms. The graphs show that the distributions of the propensity scores overlap substantially. A visual examination of the before-matching distribution also allows checking of the region of CS. In each graph there is sufficient overlap between the treatment and control groups, which suggests that one can make reasonable comparisons. We then match individuals in the treatment group with individuals in the comparison group based on the kernel-based propensity scores. Figures 2a and 2b compare the propensity-score distributions of the treatment and comparison groups before and after matching. The density plots show that the propensity scores have similar trends and reveal an extensive overlap of the distributions.
Next, we check the balance diagnostics (Table 2). “In the context of propensity-score matching, balance diagnostics enable applied researchers to assess whether the propensity-score model has been adequately specified” (Austin, 2009). Table 2 reports both the bias and the mean differences between the treatment and comparison groups in the matched sample. The matched groups’ balance is off by only a small amount: the standardized bias for overall HEES participation is 0.5%, well below the unmatched maximum of about 59%. Moreover, the differences between the groups became statistically insignificant during the post-matching period ( $t=0.39$ ).
Note: Propensity scores are estimated conditional on pre-treatment (survey) observable characteristics.
Table 2 also shows the assessments of online and mail-in survey participation. The pre- and post-matching trends for the overall survey and the online survey are close to each other. The standardized bias for online participants is 0.3%, which is also less than the unmatched maximum of about 14%, suggesting that (even before any matching) the group of online participants was more similar to its comparison group than were the general survey and mail-in participants. In both the overall and online scenarios the propensity score is balanced in the matched sample. In contrast, the pre- and post-matching differences are significant for the mail-in audit participants, although there is a substantial reduction in percentage bias: the pre-matching bias was reduced from about 36% to about 6%. Studies suggest that the standardized bias should be less than 5% to 10% (Rosenbaum & Rubin, 1985; Austin, 2009). In addition, the sample size in the data influences the $t$ -test (Austin, 2009). For mail-in participants, the number of future participants is much greater than the treatment group, so one should not place undue emphasis on the $t$ -test versus the standardized percent bias.
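For a single covariate, the standardized percent bias reported in Table 2 can be computed as follows (a sketch following the Rosenbaum and Rubin (1985) definition; the report’s exact variance convention may differ):

```python
import numpy as np

def standardized_bias(x_treat, x_comp):
    """Standardized percent bias for one covariate: the difference in
    group means, scaled by the square root of the average of the two
    group sample variances, times 100."""
    mean_diff = x_treat.mean() - x_comp.mean()
    pooled_sd = np.sqrt((x_treat.var(ddof=1) + x_comp.var(ddof=1)) / 2.0)
    return 100.0 * mean_diff / pooled_sd
```

Unlike the $t$-statistic, this measure does not grow with the sample size, which is why it is the preferred diagnostic when, as for the mail-in group, the comparison pool is much larger than the treatment group.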
4.2 Estimation results
We now examine various measures over a two-year period to investigate how customers who participated in the energy efficiency audits performed, both on average and across the consumption distribution, compared to customers who waited one year to participate. We begin by presenting the standard DID estimates in which the comparison group is not matched via kernel propensity-score matching.
Table 3a summarizes the DID estimates, where the outcome is the natural log of electricity consumption. Columns 1–3 show the DID results without matching, and columns 4–6 show the propensity-score estimates. The significance of the coefficients and the small differences between the matched and unmatched coefficients (approximately one percentage point) and standard errors further verify the validity of the comparison group. Table 3b depicts the same evidence where the dependent variable is kWh consumption. The results in Tables 3a and 3b suggest that one year after participating in the energy audit program, customers who took the survey in January 2009 reduced their electricity consumption by about 7%, or 76 kWh on average, compared to households that did not participate in the survey until January 2010. Our mean results are consistent with the meta-study of informational conservation experiments, which finds a weighted average effect of about a 7% electricity reduction (Delmas et al., Reference Delmas, Fischlein and Asensio2013).
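The 2x2 DID logic behind these estimates can be illustrated with a minimal sketch. The group means below are hypothetical, chosen only to show how a roughly 7% (log-point) effect would arise; this is not the authors' estimation code, which additionally includes seasonal controls and kernel matching weights.

```python
import math

def did_log(treat_pre, treat_post, comp_pre, comp_post):
    """2x2 difference-in-differences on log mean consumption:
    (ln T_post - ln T_pre) - (ln C_post - ln C_pre)."""
    return (math.log(treat_post) - math.log(treat_pre)) - (
        math.log(comp_post) - math.log(comp_pre))

# Hypothetical mean monthly kWh by group and period
effect = did_log(treat_pre=650, treat_post=610, comp_pre=640, comp_post=645)
print(f"{effect:.3f}")  # → -0.071, i.e. roughly a 7% reduction
```

The comparison-group change nets out common shocks (for example, weather or price changes), so only the differential change is attributed to survey participation.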
Note: Standard errors are in parentheses. Estimations are adjusted for seasonality using seasonal dummies. Ln (kWh): Log Consumption (kWh). A – aggregate, M – Mail, O – Online. *** $p<0.01$ , ** $p<0.05$ , * $p<0.1$ .
Note: Standard errors are in parentheses. Estimations are adjusted for seasonality using seasonal dummies. A – aggregate/combined, M – Mail, O – Online. Matching is based on the kernel-based propensity score. *** $p<0.01$ , ** $p<0.05$ , * $p<0.1$ .
The difference in performance between online and mail-in survey participation is also important. Tables 3a and 3b show that, on average, one year after HEES participation the online participants reduced their electricity consumption more than the mail-in participants: 11% vs. 4% (112 kWh vs. 52 kWh). Du et al. (Reference Du, Hanna, Shelton and Buege2014) report a similar differential effect between online and mail-in HEES participants. In particular, they investigated the probability of future energy efficiency program participation as a function of current HEES participation.Footnote 19 They conclude that the delivery mechanism of the survey matters for post-intervention behavior: households that participated in the online survey increased their probability of future energy efficiency program participation by 3% to 4%, compared to under 3% for mail-in participation.Footnote 20 This suggests that utilities and program designers could achieve greater behavioral responses, whether reduced electricity consumption or future participation in other behavioral programs, by promoting online survey mechanisms, which are also the least costly approach.
Table 4a depicts the average treatment effect on later consumption behavior over time. It is important to examine and distinguish short-term from longer-term behavior. The outcomes we investigate are at a quarterly frequency. As discussed earlier, the HEES provides personalized feedback and energy conservation information, but it does not provide repeated interaction, unlike home energy reports (HERs) such as the Opower energy reports. Thus, we are also interested in how customers respond to HEES audit programs in the months and year after the surveys.
Table 4a shows that households did not immediately respond to the non-binding personalized feedback. The average treatment effects increase gradually over time.Footnote 21 There is no effect after the first three months, the effect after six months is about $-3\%$ , and it is approximately $-6\%$ after nine months. One year later there is a 7% reduction in electricity consumption compared to households that have not yet joined the program. The treatment effect does not attenuate after one year; instead, habitual behavior changes, although the gains accrue at a diminishing rate. Evaluated together with the results from Du et al. (Reference Du, Hanna, Shelton and Buege2014), the contrasting results from Allcott and Rogers (Reference Allcott and Rogers2014) are not surprising. Du et al. (Reference Du, Hanna, Shelton and Buege2014) compare the probability of participating in future efficiency programs at six and 12 months and find effects of about $-4\%$ versus $-6\%$ . Households may engage in other energy efficiency programs and are also more likely to reduce their electricity consumption.
Electricity prices are not salient (Shin, Reference Shin1985; Sallee, Reference Sallee2014). This lack of salience weakens consumers' incentives to change their electricity consumption behavior. Utility consumers in the United States think about their electricity consumption only nine minutes per year, and their attention and interaction increase when they receive a high bill (Accenture, 2012). The results in Table 4a suggest that receiving higher bills or other intrinsic motivations could lead consumers to pay more attention and act on their incentives by participating in energy efficiency surveys, producing more effective habitual behavioral changes than among consumers who have not yet participated in the survey.
Note: For Tables 4a, 4b and 4c, standard errors are in parentheses. Estimations are adjusted for seasonality using seasonal dummies. Time in quarters, from survey participation. Model 1 – effect on 1st quarter, Model 2 – 2 quarters, Model 3 – 3 quarters, and Model 4 – after the entire period (year). *** $p<0.01$ , ** $p<0.05$ , * $p<0.1$ .
Tables 4b and 4c present the differential performance of mail-in and online survey participants over time. Table 4b, which shows the effect for mail-in survey participants, indicates that reactions to the surveys appear only after the first three months. In the following quarters, there are reductions in electricity consumption of about 2%, 5%, and 4%. These disaggregated effects accumulate at a decreasing rate. As shown in Table 4c, online survey participants reduced their consumption by about 7%, 10%, and 11% over the same horizon. Longer-range consumption data would make it possible to examine whether these patterns persist.
Thus far we have discussed the average treatment effect of program participation, describing the effect of the survey on the typical utility customer. However, because the dependent variable has a continuous distribution, averages may not properly reveal the changes in the distribution (Angrist & Pischke, Reference Angrist and Pischke2009). Despite the significance of the average effect, we must evaluate whether the magnitude of the effect is persistent and constant across quantiles. This shows how households at different quantiles may react differently to personalized feedback. We employ QDID with kernel-based propensity-score-matching estimation, an informative framework for examining how the quantiles of energy consumption change in response to survey participation.
Note: Standard errors are in parentheses. Estimations are adjusted for seasonality using seasonal dummies. We also tested the equality of coefficients, and the differences between coefficients are statistically significant.
*** $p<0.01$ , ** $p<0.05$ , * $p<0.1$ .
Tables 5a and 5b display the QDID estimators and the effects of survey participation at both the central (0.25, 0.5, 0.75) and extreme (0.1 and 0.9) quantiles. Table 5a provides results in terms of percentage changes in the outcome, and Table 5b, as a supplement, provides results for absolute changes in response to survey participation. Our discussion focuses primarily on percentage, or proportional, changes. The estimates show significant effects of audit participation compared to households that have not yet participated in the survey. The estimated marginal effects differ across quantiles, decreasing the farther one moves from the lowest quantile (see Figure 3). Households in the lowest quantile save approximately 8% one year after HEES participation, whereas the savings are 3% at the 90th percentile (Column 1, Table 5a). Columns 2 and 3 show the differential performance of the delivery mechanisms. At the extreme quantile of 0.9, there is no evidence of a treatment effect for the mail-in participants. The first three columns of Table 5a also reveal that, among survey participants, the lower consumption quantiles saved much more than the upper quantiles relative to the comparison group. The changes in the marginal effects across quantiles are smaller for the online audit participants than for the mail-in participants.
Note: Standard errors are in parentheses. Estimations are adjusted for seasonality using seasonal dummies.
*** $p<0.01$ , ** $p<0.05$ , * $p<0.1$ .
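The QDID estimator can be read as the pre-to-post change at a given quantile for the treated group minus the same change for the comparison group. The snippet below is an illustrative sketch on hypothetical log-consumption data; it omits the kernel propensity-score weighting and seasonal controls used in the actual estimation, and the sample values are our own, chosen to mimic the pattern of larger savings at lower quantiles.

```python
def quantile(xs, q):
    """Empirical quantile with linear interpolation between order statistics."""
    s = sorted(xs)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def qdid(t_pre, t_post, c_pre, c_post, q):
    """Quantile DID at quantile q: the treated group's change at quantile q
    minus the comparison group's change at the same quantile."""
    return (quantile(t_post, q) - quantile(t_pre, q)) - (
        quantile(c_post, q) - quantile(c_pre, q))

# Hypothetical log monthly consumption; treated households cut usage
# more at the bottom of the distribution than at the top.
t_pre  = [5.0, 6.0, 7.0]
t_post = [4.6, 5.7, 6.8]
c_pre  = [5.0, 6.0, 7.0]
c_post = [5.0, 6.0, 7.0]

print(round(qdid(t_pre, t_post, c_pre, c_post, 0.1), 2))  # larger reduction at q = 0.1
print(round(qdid(t_pre, t_post, c_pre, c_post, 0.9), 2))  # smaller reduction at q = 0.9
```

Because the outcome is in logs, each QDID value approximates a proportional change at that quantile, mirroring the percentage effects reported in Table 5a.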
The results in Table 5a carry important implications for policy makers and program designers beyond the average effect. Consumers in the lowest quantiles are inclined to respond more substantially to non-binding energy conservation feedback than consumers in the median or highest quantiles. The critical result of our quantile regressions is that the estimated survey effects differ by the level of pre-survey household consumption.Footnote 22
Our final set of results indicates that analyzing distributional impacts, rather than just the average effect, provides a better understanding of program effects and supports broader implications. The pattern of distributional impacts of an energy efficiency program can be a powerful tool for assigning more targeted and salient interventions, maximizing program impact while reducing implementation costs. The literature also discusses whether energy efficiency programs and standards have regressive implications (Fullerton, Reference Fullerton2008; Levinson, Reference Levinson2016). Our results show that low-usage households, among the survey participants, save proportionally more energy than do high-use customers. This suggests that once households opt into home energy efficiency programs such as HEES, there is no evidence of distributionally regressive implications for electricity use.
5 Concluding remarks and policy implications
Energy efficiency plays a critical role in energy policy debates because meeting our future needs boils down to only two options: increasing the supply of energy or decreasing the demand for it (Gillingham et al., Reference Gillingham, Newell and Palmer2006). Given the high up-front costs of constructing large renewable energy facilities and transmission lines, and the uncertainty in federal- or state-level support, end-use programs could lessen the pressure by reducing demand (Considine & Manderson, Reference Considine and Manderson2014). Information provision and salience have been documented to affect consumer decisions, including decisions to invest in energy-efficient technologies (Newell & Siikamäki, Reference Newell and Siikamäki2014). In our research we examine one of the statewide programs in California (HEES) and determine how well the program has worked in terms of saving energy.
The objective of our study is to provide an alternative measurement approach for evaluating energy efficiency programs. Implementing the method suggested by Sianesi (Reference Sianesi2004, Reference Sianesi2008), we determined an adequate comparison group to correct for self-selection in non-experimental energy efficiency program evaluations. We then employed a diagnostic test similar to the method suggested by Austin (Reference Austin2009) after matching on estimated kernel-based propensity scores. Combining the two regression estimators, we described ways to address the systematic differences between treated and comparison individuals when investigating the effects of residential energy efficiency surveys. Our research is unique in applying the combined methods to the evaluation of residential energy efficiency programs.
Although the impact was heterogeneous, we provide evidence that the customers who participated in the survey reduced their electricity consumption by about 7%, or about 76 kWh/month on average (see Table 3b). Here, we present a simple calculation of the realized monetary savings for the 2009 HEES participants. If we scale the savings to all 2009 HEES participants (January through December), the total reduction in energy consumption would be about 2 million kWh per month, an amount equal to the typical monthly consumption of approximately 3500 households in California. Using a carbon price of $21 per metric ton of carbon dioxide (EPA, 2015; Greenstone et al., Reference Greenstone, Kopits and Wolverton2013), the electricity savings imply a monthly reduction of 1527 metric tons of carbon dioxide emissions, a social cost reduction of about $32,000 per month.Footnote 23
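This back-of-envelope calculation can be verified directly from the figures in the text; note that the emission factor below (about 0.76 kg CO2 per kWh) is implied by the stated quantities rather than given explicitly.

```python
# Back-of-envelope check of the monthly savings figures in the text.
monthly_kwh_saved = 2_000_000  # total reduction, all 2009 participants
co2_tons = 1527                # stated reduced emissions (metric tons CO2)
scc_per_ton = 21               # social cost of carbon, $/metric ton

# Emission factor implied by the stated figures, in kg CO2 per kWh
factor_kg_per_kwh = co2_tons * 1000 / monthly_kwh_saved
# Monthly social cost reduction in dollars
social_cost_saved = co2_tons * scc_per_ton

print(round(factor_kg_per_kwh, 2))  # → 0.76 (implied kg CO2/kWh)
print(social_cost_saved)            # → 32067, i.e. "about $32,000 per month"
```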
Additionally, we evaluated the benefits and costs of the program using the per-unit survey cost, as we did not have access to the total 2009 program cost, which also included administrative, implementation, measurement, evaluation, and other program-related costs. The per-unit cost of the mail-in survey was about $12 for SCE’s HEES program. If we assume that all customers used mail-in surveys, which are more expensive than online surveys, the aggregate cost would have been about $324,000. On the other hand, using the 2011 average California residential electricity rate of about 15 cents/kWh (EIA, 2011), the total reduction in monthly bills would have been about $308,000 per month. Thus, participating in the voluntary energy efficiency program we examined would save about $12 a month per customer. The net reductions could be higher if customers moved from higher-tier rates to lower tiers as a result of reductions in their energy usage. However, since the sample is not randomly selected, it is not possible to extrapolate any tier-movement savings to the IOUs’ entire high-usage customer base.
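The benefit-cost arithmetic can likewise be reproduced from the stated figures. The participant count below is implied by the aggregate and per-unit costs rather than stated in the text, and the small discrepancy with the text's $308,000 reflects rounding of the roughly 2 million kWh monthly savings.

```python
# Rough benefit-cost check using the figures in the text.
unit_survey_cost = 12          # $ per mail-in survey (SCE HEES)
total_survey_cost = 324_000    # $ aggregate cost, assuming all mail-in
rate_cents_per_kwh = 15        # 2011 CA average residential rate
monthly_kwh_saved = 2_000_000  # total monthly reduction from the text

participants = total_survey_cost / unit_survey_cost          # implied count
monthly_bill_savings = monthly_kwh_saved * rate_cents_per_kwh / 100
per_customer = monthly_bill_savings / participants

print(int(participants))          # → 27000 implied participants
print(int(monthly_bill_savings))  # → 300000, vs. the text's ~$308,000
print(round(per_customer))        # → 11, i.e. "about $12 a month per customer"
```

On these figures the one-time survey cost is recovered within roughly the first month of bill savings, which is why the program appears cost-effective even before counting avoided emissions.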
Both versions of the survey show significant energy-saving effects: 11% for online and 4% for mail-in participants. An implication is that how the program is delivered matters as much as having a program; Du et al. (Reference Du, Hanna, Shelton and Buege2014) report similar findings. In addition, our results suggest program effects that become significant and grow in magnitude gradually over time, though at a decreasing increment.
Electricity prices are not salient, which already creates a weak incentive to change behavior and routines. To produce more persistent effects, customers should be reminded of the intervention, because the effects decay. Harding and Hsiaw (Reference Harding and Hsiaw2014) suggest that some households may actually view energy efficiency surveys as a commitment device. It is therefore necessary to have additional interactions with households. Because the persistence of treatment also has a spillover effect in the year after the intervention, leading customers to other energy efficiency programs, an assessment of cost-effectiveness should include these spillovers as well (Allcott & Rogers, Reference Allcott and Rogers2014).
In addition, because of the heterogeneity in pre-treatment energy consumption, we examined the QDID estimator. Our results suggest that as the quantiles of the distribution increase, the effect of the program on electricity consumption decreases in terms of proportionate usage (see Figure 3). Households at the lower quantiles save proportionally more electricity than do customers at higher quantiles.Footnote 24 Better customer targeting based on the usage distribution would create significant savings, improving the efficiency of the programs and potentially addressing equity concerns.Footnote 25 Once households opt into HEES, we find no evidence that the program burdens low-use and low-income households more than households in the higher quantiles of electricity consumption. Our results imply that program designers can better target low-use and low-income households, because they are more likely to benefit from such programs through savings. Overall, program participants on average use more electricity than non-participants (see Figure 1).
We show that understanding distributional effects can be crucial when deciding whether to implement energy efficiency programs and how to extract savings cost-effectively. Furthermore, better-targeted information can address biased beliefs, present bias, inattention, and other decision biases (Allcott, Reference Allcott2016; Allcott & Taubinsky, Reference Allcott and Taubinsky2015; Keefer & Rustamov, Reference Keefer and Rustamov2017). Allcott et al. (Reference Allcott, Knittel and Taubinsky2015) also suggest that if restricting eligibility is not institutionally feasible, targeted marketing to high-response groups would generate savings and enhance policy cost-effectiveness. To a certain degree, targeted programs may address market failures, including those caused by behavioral biases.
Appendix