FUNCTIONAL SEQUENTIAL TREATMENT ALLOCATION WITH COVARIATES

Anders Bredahl Kock; David Preinerstorfer; Bezirgen Veliyev

doi:10.1017/S0266466623000051

Published online by Cambridge University Press: 16 March 2023

Anders Bredahl Kock: University of Oxford
David Preinerstorfer: University of St.Gallen
Bezirgen Veliyev*: Aarhus University
*Address correspondence to Bezirgen Veliyev, Department of Economics and Business Economics, Aarhus University, Fuglesangs Alle 4, 8210 Aarhus V, Denmark; e-mail: [email protected].

Abstract

We consider a sequential treatment problem with covariates. Given a realization of the covariate vector, instead of targeting the treatment with the highest conditional expectation, the decision-maker targets the treatment that maximizes a general functional of the conditional potential outcome distribution, e.g., a conditional quantile, a trimmed mean, or a socioeconomic functional such as an inequality, welfare, or poverty measure. We develop expected regret lower bounds for this problem and construct a near minimax optimal sequential assignment policy.
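To make the distinction between targets concrete, the following minimal sketch compares two hypothetical treatments at one fixed covariate value. It is not the sequential assignment policy constructed in the article; it only illustrates that the treatment with the highest mean need not be the one that maximizes another functional, here a lower quartile. The outcome values and the names `outcomes`, `lower_quartile`, and `best_treatment` are assumptions introduced for illustration.

```python
from statistics import mean, quantiles


def lower_quartile(sample):
    """Return the 25% quantile of a sample (inclusive interpolation)."""
    return quantiles(sample, n=4, method="inclusive")[0]


def best_treatment(outcomes_by_treatment, functional):
    """Return the treatment label whose sample maximizes the functional."""
    return max(
        outcomes_by_treatment,
        key=lambda label: functional(outcomes_by_treatment[label]),
    )


# Hypothetical outcomes observed at one covariate value (illustrative only).
outcomes = {
    "A": [0, 0, 10, 10, 30],  # higher mean, but a weak lower tail
    "B": [7, 8, 9, 9, 10],    # lower mean, but a stronger lower tail
}

by_mean = best_treatment(outcomes, mean)
by_quartile = best_treatment(outcomes, lower_quartile)

print("Best by mean:", by_mean)
print("Best by lower quartile:", by_quartile)
```

In this toy data, a decision-maker who cares about the lower tail of the outcome distribution would prefer a different treatment from one who cares only about the average, which is the kind of divergence the functional target is meant to capture.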

Type: ARTICLES
Information: Econometric Theory, Volume 40, Issue 6, December 2024, pp. 1211–1252

DOI: https://doi.org/10.1017/S0266466623000051
Copyright: © The Author(s), 2023. Published by Cambridge University Press

Footnotes

We are grateful to the Editor, a Co-Editor, and three anonymous referees for their valuable comments.
