Hostname: page-component-745bb68f8f-grxwn Total loading time: 0 Render date: 2025-01-12T15:04:02.262Z Has data issue: false hasContentIssue false

Randomized numerical linear algebra: Foundations and algorithms

Published online by Cambridge University Press:  30 November 2020

Per-Gunnar Martinsson
Affiliation:
Department of Mathematics, University of Texas at AustinAustin, TX78712, USA E-mail: [email protected]
Joel A. Tropp
Affiliation:
Computing & Mathematical Sciences, California Institute of TechnologyPasadena, CA91125, USA E-mail: [email protected]

Abstract

This survey describes probabilistic algorithms for linear algebraic computations, such as factorizing matrices and solving linear systems. It focuses on techniques that have a proven track record for real-world problems. The paper treats both the theoretical foundations of the subject and practical computational issues.

Topics include norm estimation, matrix approximation by sampling, structured and unstructured random embeddings, linear regression problems, low-rank approximation, subspace iteration and Krylov methods, error estimation and adaptivity, interpolatory and CUR factorizations, Nyström approximation of positive semidefinite matrices, single-view (‘streaming’) algorithms, full rank-revealing factorizations, solvers for linear systems, and approximation of kernel matrices that arise in machine learning and in scientific computing.

Type
Research Article
Copyright
© The Author(s), 2020. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

References 2

Achlioptas, D. (2003), ‘Database-friendly random projections: Johnson–Lindenstrauss with binary coins’, J. Comput. System Sci. 66, 671687.Google Scholar
Achlioptas, D. and McSherry, F. (2001), Fast computation of low rank matrix approximations. In 33rd Annual ACM Symposium on Theory of Computing, ACM, pp. 611618.Google Scholar
Achlioptas, D. and McSherry, F. (2007), ‘Fast computation of low-rank matrix approximations’, J. Assoc. Comput. Mach. 54, 9.CrossRefGoogle Scholar
Ahlswede, R. and Winter, A. (2002), ‘Strong converse for identification via quantum channels’, IEEE Trans. Inform. Theory 48, 569579.CrossRefGoogle Scholar
Ailon, N. and Chazelle, B. (2006), Approximate nearest neighbors and the fast Johnson–Lindenstrauss transform. In 38th Annual ACM Symposium on Theory of Computing, ACM, pp. 557563.Google Scholar
Ailon, N. and Chazelle, B. (2009), ‘The fast Johnson–Lindenstrauss transform and approximate nearest neighbors’, SIAM J. Comput. 39, 302322.Google Scholar
Alaoui, A. and Mahoney, M. W. (2015), Fast randomized kernel ridge regression with statistical guarantees. In Advances in Neural Information Processing Systems 28 (Cortes, C. et al., eds), Curran Associates, pp. 775783.Google Scholar
Alon, N., Gibbons, P. B., Matias, Y. and Szegedy, M. (2002), ‘Tracking join and self-join sizes in limited storage’, J. Comput. System Sci. 64, 719747.CrossRefGoogle Scholar
Alon, N., Matias, Y. and Szegedy, M. (1999), ‘The space complexity of approximating the frequency moments’, J. Comput. System Sci. 58, 137147.Google Scholar
Ambikasaran, S., Foreman-Mackey, D., Greengard, L., Hogg, D. W. and O’Neil, M. (2015), ‘Fast direct methods for Gaussian processes’, IEEE Trans. Pattern Anal. Machine Intel. 38, 252265.CrossRefGoogle Scholar
Amelunxen, D., Lotz, M., McCoy, M. B. and Tropp, J. A. (2014), ‘Living on the edge: Phase transitions in convex programs with random data’, Inf. Inference 3, 224294.CrossRefGoogle Scholar
Ando, T. (2005), Schur complements and matrix inequalities: Operator-theoretic approach. In The Schur Complement and its Applications (Zhang, F., ed.), Vol. 4 of Numerical Methods and Algorithms, Springer, pp. 137162.CrossRefGoogle Scholar
Arcones, M. A. and Gine, E. (1992), ‘On the bootstrap of $u$ and $v$ statistics’, Ann. Statist. 20, 655674.Google Scholar
Arras, B., Bachmayr, M. and Cohen, A. (2019), ‘Sequential sampling for optimal weighted least squares approximations in hierarchical spaces’, SIAM J. Math. Data Sci. 1, 189207.CrossRefGoogle Scholar
Avron, H. (2018), Randomized Riemannian preconditioning for quadratically constrained problems. Slides, Workshop on Randomized Numerical Linear Algebra, Simons Institute, UC Berkeley.Google Scholar
Avron, H., Clarkson, K. and Woodruff, D. (2017), ‘Faster kernel ridge regression using sketching and preconditioning’, SIAM J. Matrix Anal. Appl. 38, 11161138.CrossRefGoogle Scholar
Avron, H., Druinsky, A. and Gupta, A. (2015), ‘Revisiting asynchronous linear solvers: Provable convergence rate through randomization’, J. Assoc. Comput. Mach. 62, 51.CrossRefGoogle Scholar
Avron, H., Kapralov, M., Musco, C., Musco, C., Velingker, A. and Zandieh, A. (2019), A universal sampling method for reconstructing signals with simple Fourier transforms. In 51st Annual ACM SIGACT Symposium on Theory of Computing (STOC 2019), ACM, pp. 10511063.CrossRefGoogle Scholar
Avron, H., Maymounkov, P. and Toledo, S. (2010), ‘Blendenpik: Supercharging LAPACK’s least-squares solver’, SIAM J. Sci. Comput. 32, 12171236.CrossRefGoogle Scholar
Baboulin, M., Dongarra, J. J., Rémy, A., Tomov, S. and Yamazaki, I. (2017), ‘Solving dense symmetric indefinite systems using GPUs’, Concurr. Comput. Pract. Exp. 29, e4055.CrossRefGoogle Scholar
Baboulin, M., Li, X. S. and Rouet, F.-H. (2014), Using random butterfly transformations to avoid pivoting in sparse direct methods. In International Conference on High Performance Computing for Computational Science (VECPAR 2014), Vol. 8969 of Lecture Notes in Computer Science, Springer, pp. 135144.Google Scholar
Bach, F. (2013), Sharp analysis of low-rank kernel matrix approximations. In 26th Annual Conference on Learning Theory, Vol. 30 of JMLR Workshop and Conference Proceedings, PMLR, pp. 185209.Google Scholar
Bach, F. (2017), ‘On the equivalence between kernel quadrature rules and random feature expansions’, J. Mach. Learn. Res. 18, 138.Google Scholar
Bach, F. R. and Jordan, M. (2005), Predictive low-rank decomposition for kernel methods. In 22nd International Conference on Machine Learning (ICML ’05), ACM, pp. 3340.Google Scholar
Bai, Z. and Silverstein, J. W. (2010), Spectral Analysis of Large Dimensional Random Matrices, second edition, Springer Series in Statistics, Springer.CrossRefGoogle Scholar
Bai, Z., Demmel, J., Dongarra, J., Ruhe, A. and van der Vorst, H. (1987), Templates for the Solution of Algebraic Eigenvalue Problems: A Practical Guide (Software, Environments and Tools), first edition, SIAM.Google Scholar
Baldi, P. and Vershynin, R. (2019), ‘Polynomial threshold functions, hyperplane arrangements, and random tensors’, SIAM J. Math. Data Sci. 1, 699729.CrossRefGoogle Scholar
Ballard, G., Demmel, J., Dumitriu, I. and Rusciano, A. (2019), A generalized randomized rank-revealing factorization. arXiv:1909.06524 Google Scholar
Banks, J., Vargas, J. G., Kulkarni, A. and Srivastava, N. (2019), Pseudospectral shattering, the sign function, and diagonalization in nearly matrix multiplication time. arXiv:1912.08805 Google Scholar
Barnes, J. and Hut, P. (1986), ‘A hierarchical $O(NlogN)$ force-calculation algorithm’, Nature 324, 446449.CrossRefGoogle Scholar
Bartlett, P. (2013), U-statistics. Berkeley Statistics 210B Lecture Notes.Google Scholar
Batson, J., Spielman, D. A. and Srivastava, N. (2014), ‘Twice-Ramanujan sparsifiers’, SIAM Rev. 56, 315334.CrossRefGoogle Scholar
Bebendorf, M. and Grzhibovskis, R. (2006), ‘Accelerating Galerkin BEM for linear elasticity using adaptive cross approximation’, Math. Methods Appl. Sci. 29, 17211747.CrossRefGoogle Scholar
Bhatia, R. (1997), Matrix Analysis, Vol. 169 of Graduate Texts in Mathematics, Springer.CrossRefGoogle Scholar
Bischof, C. H. and Quintana-Ortí, G. (1998), ‘Computing rank-revealing QR factorizations of dense matrices’, ACM Trans. Math. Software 24, 226253.CrossRefGoogle Scholar
Bochner, S. (1933), ‘Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse’, Math. Ann. 108, 378410.CrossRefGoogle Scholar
Bottou, L. (2010), Large-scale machine learning with stochastic gradient descent. In Proceedings of COMPSTAT ’2010 (Lechevallier, Y. and Saporta, G., eds), Physica-Verlag HD, pp. 177186.CrossRefGoogle Scholar
Boucheron, S., Lugosi, G. and Massart, P. (2013), Concentration Inequalities: A Nonasymptotic Theory of Independence, Oxford University Press.Google Scholar
Bourgain, J., Dirksen, S. and Nelson, J. (2015), ‘Toward a unified theory of sparse dimensionality reduction in Euclidean space’, Geom. Funct. Anal. 25, 10091088.CrossRefGoogle Scholar
Boutsidis, C., Mahoney, M. W. and Drineas, P. (2009), An improved approximation algorithm for the column subset selection problem. In 20th Annual ACM–SIAM Symposium on Discrete Algorithms, SIAM, pp. 968977.Google Scholar
Boutsidis, C., Woodruff, D. P. and Zhong, P. (2016), Optimal principal component analysis in distributed and streaming models. In 48th Annual ACM Symposium on Theory of Computing, ACM, pp. 236249.Google Scholar
Briggs, W. L. and Henson, V. E. (1995), The DFT: An Owner’s Manual for the Discrete Fourier Transform, SIAM.Google Scholar
Bringmann, K. and Panagiotou, K. (2017), ‘Efficient sampling methods for discrete distributions’, Algorithmica 79, 484508.CrossRefGoogle Scholar
Candès, E., Demanet, L. and Ying, L. (2009), ‘A fast butterfly algorithm for the computation of Fourier integral operators’, Multiscale Model. Simul. 7, 17271750.CrossRefGoogle Scholar
Carl, B. (1985), ‘Inequalities of Bernstein–Jackson-type and the degree of compactness of operators in Banach spaces’, Ann. Inst. Fourier (Grenoble) 35, 79118.Google Scholar
Carmon, Y., Duchi, J. C., Sidford, A. and Tian, K. (2019), A rank-1 sketch for matrix multiplicative weights. In 32nd Conference on Learning Theory (Beygelzimer, A. and Hsu, D., eds), Vol. 99 of Proceedings of Machine Learning Research, PMLR, pp. 589623.Google Scholar
Chandrasekaran, V., Recht, B., Parrilo, P. A. and Willsky, A. S. (2012), ‘The convex geometry of linear inverse problems’, Found. Comput. Math. 12, 805849.Google Scholar
Charikar, M., Chen, K. and Farach-Colton, M. (2004), ‘Finding frequent items in data streams’, Theoret. Comput. Sci. 312, 315.Google Scholar
Chen, L. H. Y., Goldstein, L. and Shao, Q.-M. (2011), Normal Approximation by Stein’s Method, Probability and its Applications (New York), Springer.CrossRefGoogle Scholar
Chen, X. and Price, E. (2019), Active regression via linear-sample sparsification. In 32nd Conference on Learning Theory (Beygelzimer, A. and Hsu, D., eds), Vol. 99 of Proceedings of Machine Learning Research, PMLR, pp. 663695.Google Scholar
Chen, Z. and Dongarra, J. J. (2005), ‘Condition numbers of Gaussian random matrices’, SIAM J. Matrix Anal. Appl. 27, 603620.CrossRefGoogle Scholar
Cheng, H., Gimbutas, Z., Martinsson, P.-G. and Rokhlin, V. (2005), ‘On the compression of low rank matrices’, SIAM J. Sci. Comput. 26, 13891404.CrossRefGoogle Scholar
Clarkson, K. L. and Woodruff, D. P. (2009), Numerical linear algebra in the streaming model. In 41st Annual ACM Symposium on Theory of Computing, ACM, pp. 205214.CrossRefGoogle Scholar
Clarkson, K. L. and Woodruff, D. P. (2013), Low rank approximation and regression in input sparsity time. In 2013 ACM Symposium on Theory of Computing (STOC ’13), ACM, pp. 8190.Google Scholar
Cohen, A. and Migliorati, G. (2017), ‘Optimal weighted least-squares methods’, SIAM J. Comput. Math. 3, 181203.CrossRefGoogle Scholar
Cohen, A., Davenport, M. A. and Leviatan, D. (2013), ‘On the stability and accuracy of least squares approximations’, Found. Comput. Math. 13, 819834.CrossRefGoogle Scholar
Cohen, E. and Lewis, D. D. (1999), ‘Approximating matrix multiplication for pattern recognition tasks’, J. Algorithms 30, 211252.Google Scholar
Cohen, M. B. (2016), Nearly tight oblivious subspace embeddings by trace inequalities. In 27th Annual ACM–SIAM Symposium on Discrete Algorithms, ACM, pp. 278287.Google Scholar
Cohen, M. B., Kyng, R., Miller, G. L., Pachocki, J. W., Peng, R., Rao, A. B. and Xu, S. C. (2014), Solving SDD linear systems in nearly time. In 46th Annual ACM Symposium on Theory of Computing (STOC ’14), ACM, pp. 343352.CrossRefGoogle Scholar
Cullum, J. and Donath, W. E. (1974), A block generalization of the $s$ -step Lanczos algorithm. Report RC 4845 (21570), IBM Thomas J. Watson Research Center, Yorktown Heights, New York.Google Scholar
Dao, T., Gu, A., Eichhorn, M., Rudra, A. and , C. (2019), Learning fast algorithms for linear transforms using butterfly factorizations. In 36th International Conference on Machine Learning (Chaudhuri, K. and Salakhutdinov, R., eds), Vol. 97 of Proceedings of Machine Learning Research, PMLR, pp. 15171527.Google Scholar
Davidson, K. R. and Szarek, S. J. (2001), Local operator theory, random matrices and Banach spaces. In Handbook of the Geometry of Banach Spaces, Vol. I, North-Holland, pp. 317366.Google Scholar
Davis, T. A., Rajamanickam, S. and Sid-Lakhdar, W. M. (2016), A survey of direct methods for sparse linear systems. In Acta Numerica, Vol. 25, Cambridge University Press, pp. 383566.Google Scholar
Demmel, J., Dumitriu, I. and Holtz, O. (2007), ‘Fast linear algebra is stable’, Numer. Math. 108, 5991.Google Scholar
Demmel, J., Grigori, L., Hoemmen, M. and Langou, J. (2012), ‘Communication-optimal parallel and sequential QR and LU factorizations’, SIAM J. Sci. Comput. 34, A206A239.CrossRefGoogle Scholar
Demmel, J. W., Grigori, L., Gu, M. and Xiang, H. (2015), ‘Communication avoiding rank revealing QR factorization with column pivoting’, SIAM J. Matrix Anal. Appl. 36, 5589.Google Scholar
Dixon, J. D. (1983), ‘Estimating extremal eigenvalues and condition numbers of matrices’, SIAM J. Numer. Anal. 20, 812814.CrossRefGoogle Scholar
Dobriban, E. and Liu, S. (2019), Asymptotics for sketching in least squares regression. In Advances in Neural Information Processing Systems 32 (Wallach, H. et al., eds), Curran Associates, pp. 36753685.Google Scholar
Drineas, P. and Mahoney, M. W. (2018), Lectures on randomized linear algebra. In The Mathematics of Data (Mahoney, M. W., Duchi, J. and Gilbert, A., eds), Vol. 25 of IAS/Park City Mathematics Series, AMS, pp. 148.CrossRefGoogle Scholar
Drineas, P. and Mahoney, M. W. (2005), ‘On the Nyström method for approximating a Gram matrix for improved kernel-based learning’, J. Mach. Learn. Res. 6, 21532175.Google Scholar
Drineas, P., Kannan, R. and Mahoney, M. W. (2006 a), ‘Fast Monte Carlo algorithms for matrices, I: Approximating matrix multiplication’, SIAM J. Comput. 36, 132157.CrossRefGoogle Scholar
Drineas, P., Kannan, R. and Mahoney, M. W. (2006 b), ‘Fast Monte Carlo algorithms for matrices, II: Computing a low-rank approximation to a matrix’, SIAM J. Comput. 36, 158183.CrossRefGoogle Scholar
Drineas, P., Kannan, R. and Mahoney, M. W. (2006 c), ‘Fast Monte Carlo algorithms for matrices, III: Computing a compressed approximate matrix decomposition’, SIAM J. Comput. 36, 184206.Google Scholar
Drineas, P., Mahoney, M. W. and Muthukrishnan, S. (2006 d), Subspace sampling and relative-error matrix approximation: Column-based methods. In Approximation, Randomization and Combinatorial Optimization, Vol. 4110 of Lecture Notes in Computer Science, Springer, pp. 316326.CrossRefGoogle Scholar
Drineas, P., Mahoney, M. W. and Muthukrishnan, S. (2008), ‘Relative-error CUR matrix decompositions’, SIAM J. Matrix Anal. Appl. 30, 844881.CrossRefGoogle Scholar
Duersch, J. A. and Gu, M. (2015), True BLAS-3 performance QRCP using random sampling. arXiv:1509.06820v1 Google Scholar
Duersch, J. A. and Gu, M. (2017), ‘Randomized QR with column pivoting’, SIAM J. Sci. Comput. 39, C263C291.Google Scholar
Edelman, A. S. (1989), Eigenvalues and condition numbers of random matrices. ProQuest LLC, Ann Arbor, MI. PhD thesis, Massachusetts Institute of Technology.Google Scholar
Efron, B. (1982), The Jackknife, the Bootstrap and Other Resampling Plans, Vol. 38 of CBMS-NSF Regional Conference Series in Applied Mathematics, SIAM.CrossRefGoogle Scholar
Feldman, D., Volkov, M. and Rus, D. (2016), Dimensionality reduction of massive sparse datasets using coresets. In Advances in Neural Information Processing Systems 29 (Lee, D. D. et al., eds), Curran Associates, pp. 27662774.Google Scholar
Feng, Y., Xiao, J. and Gu, M. (2019), ‘Flip-flop spectrum-revealing QR factorization and its applications to singular value decomposition’, Electron. Trans. Numer. Anal. 51, 469494.CrossRefGoogle Scholar
Fierro, R. D., Hansen, P. C. and Hansen, P. S. K. (1999), ‘UTV tools: Matlab templates for rank-revealing UTV decompositions’, Numer. Algorithms 20, 165194.Google Scholar
Fine, S. and Scheinberg, K. (2001), ‘Efficient SVM training using low-rank kernel representation’, J. Mach. Learn. Res. 2, 243264.Google Scholar
Fitzsimons, J. K., Osborne, M. A., Roberts, S. J. and Fitzsimons, J. F. (2018), Improved stochastic trace estimators using mutually unbiased bases. In Uncertainty in Artificial Intelligence: Proceedings of the Thirty-Fourth Conference (Globerson, A. and Silva, R., eds), AUAI Press.Google Scholar
Foucart, S. and Rauhut, H. (2013), A Mathematical Introduction to Compressive Sensing, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer.CrossRefGoogle Scholar
Frieze, A., Kannan, R. and Vempala, S. (2004), ‘Fast Monte-Carlo algorithms for finding low-rank approximations’, J. Assoc. Comput. Mach. 51, 10251041.CrossRefGoogle Scholar
Ghashami, M., Liberty, E., Phillips, J. M. and Woodruff, D. P. (2016 a), ‘Frequent directions: Simple and deterministic matrix sketching’, SIAM J. Comput. 45, 17621792.Google Scholar
Ghashami, M., Perry, D. J. and Phillips, J. (2016 b), Streaming kernel principal component analysis. In 19th International Conference on Artificial Intelligence and Statistics (Gretton, A. and Robert, C. C., eds), Vol. 51 of Proceedings of Machine Learning Research, PMLR, pp. 13651374.Google Scholar
Ghysels, P., Li, X. S., Gorman, C. and Rouet, F.-H. (2017), A robust parallel preconditioner for indefinite systems using hierarchical matrices and randomized sampling. In 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), IEEE, pp. 897906.CrossRefGoogle Scholar
Gionis, A., Indyk, P. and Motwani, R. (1999), Similarity search in high dimensions via hashing. In 25th International Conference on Very Large Data Bases (VLDB ’99), Morgan Kaufmann, pp. 518529.Google Scholar
Girard, D. A. (1989), ‘A fast “Monte Carlo cross-validation” procedure for large least squares problems with noisy data’, Numer. Math. 56, 123.Google Scholar
Gittens, A. (2013), Topics in Randomized Numerical Linear Algebra. ProQuest LLC, Ann Arbor, MI. PhD thesis, California Institute of Technology.Google Scholar
Goldstein, L., Nourdin, I. and Peccati, G. (2017), ‘Gaussian phase transitions and conic intrinsic volumes: Steining the Steiner formula’, Ann. Appl. Probab. 27, 147.CrossRefGoogle Scholar
Golub, G. H. and Meurant, G. (1994), Matrices, moments and quadrature. In Numerical Analysis 1993, Vol. 303 of Pitman Research Notes in Mathematics Series, Longman Scientific and Technical, pp. 105156.Google Scholar
Golub, G. H. and Meurant, G. (2010), Matrices, Moments and Quadrature with Applications, Princeton Series in Applied Mathematics, Princeton University Press.CrossRefGoogle Scholar
Golub, G. H. and Underwood, R. (1977), The block Lanczos method for computing eigenvalues. In Mathematical Software III (Proceedings of a Symposium Conducted by the Mathematics Research Center, the University of Wisconsin–Madison), Academic Press, pp. 361377.Google Scholar
Golub, G. H. and Van Loan, C. F. (2013), Matrix Computations, fourth edition, Johns Hopkins Studies in the Mathematical Sciences, The Johns Hopkins University Press.Google Scholar
Golub, G. H., Luk, F. T. and Overton, M. L. (1981), ‘A block Lánczos method for computing the singular values of corresponding singular vectors of a matrix’, ACM Trans. Math. Software 7, 149169.CrossRefGoogle Scholar
Good, I. J. (1977), ‘A new formula for $k$ -statistics’, Ann. Statist. 5, 224228.CrossRefGoogle Scholar
Gopal, A. and Martinsson, P.-G. (2018), The PowerURV algorithm for computing rank-revealing full factorizations. arXiv:1812.06007 Google Scholar
Gordon, Y. (1988), On Milman’s inequality and random subspaces which escape through a mesh in ${\mathbb{R}}^n$ . In Geometric Aspects of Functional Analysis (1986/87), Vol. 1317 of Lecture Notes in Mathematics, Springer, pp. 84106.CrossRefGoogle Scholar
Goreinov, S. A., Oseledets, I. V., Savostyanov, D. V., Tyrtyshnikov, E. E. and Zamarashkin, N. L. (2010), How to find a good submatrix. In Matrix Methods: Theory, Algorithms And Applications: Dedicated to the Memory of Gene Golub, World Scientific, pp. 247256.CrossRefGoogle Scholar
Goreinov, S. A., Zamarashkin, N. L. and Tyrtyshnikov, E. E. (1997), ‘Pseudo-skeleton approximations by matrices of maximal volume’, Math. Notes 62, 515519.CrossRefGoogle Scholar
Gower, R. and Richtárik, P. (2015), ‘Randomized iterative methods for linear systems’, SIAM J. Matrix Anal. Appl. 36, 16601690.CrossRefGoogle Scholar
Gower, R., Hanzely, F., Richtárik, P. and Stich, S. U. (2018), Accelerated stochastic matrix inversion: General theory and speeding up BFGS rules for faster second-order optimization. In Advances in Neural Information Processing Systems 31 (Bengio, S. et al., eds), Curran Associates, pp. 16191629.Google Scholar
Grasedyck, L. and Hackbusch, W. (2003), ‘Construction and arithmetics of $\mathbf{\mathcal{H}}$ -matrices’, Computing 70, 295334.CrossRefGoogle Scholar
Gratton, S. and Titley-Peloquin, D. (2018), ‘Improved bounds for small-sample estimation’, SIAM J. Matrix Anal. Appl. 39, 922931.CrossRefGoogle Scholar
Greengard, L. and Rokhlin, V. (1987), ‘A fast algorithm for particle simulations’, J. Comput. Phys. 73, 325348.CrossRefGoogle Scholar
Grimmett, G. R. and Stirzaker, D. R. (2001), Probability and Random Processes, third edition, Oxford University Press.Google Scholar
Gu, M. (2015), ‘Subspace iteration randomization and singular value problems’, SIAM J. Sci. Comput. 37, A1139A1173.Google Scholar
Gu, M. and Eisenstat, S. C. (1996), ‘Efficient algorithms for computing a strong rank-revealing QR factorization’, SIAM J. Sci. Comput. 17, 848869.CrossRefGoogle Scholar
Guo, H., Liu, Y., Hu, J. and Michielssen, E. (2017), ‘A butterfly-based direct integral-equation solver using hierarchical LU factorization for analyzing scattering from electrically large conducting objects’, IEEE Trans. Antennas Propagation 65, 47424750.CrossRefGoogle Scholar
Hackbusch, W. (1999), ‘A sparse matrix arithmetic based on H-matrices, I: Introduction to H-matrices’, Computing 62, 89108.CrossRefGoogle Scholar
Hackbusch, W., Khoromskij, B. and Sauter, S. (2002), On ${\mathbf{\mathcal{H}}}^2$ -matrices. In Lectures on Applied Mathematics (Bungartz, H.-J., Hoppe, R. H. W. and Zenger, C., eds), Springer, pp. 929.Google Scholar
Halko, N., Martinsson, P. G. and Tropp, J. A. (2011 a), ‘Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions’, SIAM Rev. 53, 217288.CrossRefGoogle Scholar
Halko, N., Martinsson, P.-G., Shkolnisky, Y. and Tygert, M. (2011 b), ‘An algorithm for the principal component analysis of large data sets’, SIAM J. Sci. Comput. 33, 25802594.CrossRefGoogle Scholar
Hamid, R., Xiao, Y., Gittens, A. and Decoste, D. (2014), Compact random feature maps. In 31st International Conference on Machine Learning (Xing, E. P. and Jebara, T., eds), Vol. 32 of Proceedings of Machine Learning Research, PMLR, pp. 1927.Google Scholar
Hampton, J. and Doostan, A. (2015), ‘Coherence motivated sampling and convergence analysis of least squares polynomial chaos regression’, Comput. Methods Appl. Mech. Engrg 290, 7397.CrossRefGoogle Scholar
Hennig, P. and Osborne, M. A. (2019), probabilistic-numerics.org Google Scholar
Hestenes, M. R. and Stiefel, E. (1952), ‘Methods of conjugate gradients for solving linear systems’, J. Res. Nat. Bur. Standards 49, 2379.CrossRefGoogle Scholar
Horn, R. A. and Johnson, C. R. (2013), Matrix Analysis, second edition, Cambridge University Press.Google Scholar
Hutchinson, M. F. (1990), ‘A stochastic estimator of the trace of the influence matrix for Laplacian smoothing splines’, Comm. Statist. Simul. Comput. 19, 433450.CrossRefGoogle Scholar
Indyk, P. and Motwani, R. (1999), Approximate nearest neighbors: Towards removing the curse of dimensionality. In 30th Annual ACM Symposium on Theory of Computing (STOC ’98), ACM, pp. 604613.Google Scholar
Jin, R., Kolda, T. G. and Ward, R. (2019), Faster Johnson–Lindenstrauss transforms via Kronecker products. arXiv:1909.04801 Google Scholar
Johnson, W. B. and Lindenstrauss, J. (1984), Extensions of Lipschitz mappings into a Hilbert space. In Conference in Modern Analysis and Probability (New Haven, Conn., 1982), Vol. 26 of Contemporary Mathematics, AMS, pp. 189206.CrossRefGoogle Scholar
Johnstone, I. M. (2001), ‘On the distribution of the largest eigenvalue in principal components analysis’, Ann. Statist. 29, 295327.CrossRefGoogle Scholar
Joubert, W. D. and Carey, G. F. (1991), Parellelizable restarted iterative methods for nonsymmetric linear systems. Center for Numerical Analysis Report CNA-251, UT-Austin.CrossRefGoogle Scholar
Kac, M. (1956), Foundations of kinetic theory. In Third Berkeley Symposium on Mathematical Statistics and Probability, 1954–1955, Vol. III, University of California Press, pp. 171197.Google Scholar
Kahan, W. (1966), ‘Numerical linear algebra’, Canad. Math. Bull. 9, 757801.Google Scholar
Kannan, R. and Vempala, S. (2017), Randomized algorithms in numerical linear algebra. In Acta Numerica, Cambridge University Press, Vol. 26, pp. 95135.Google Scholar
Kar, P. and Karnick, H. (2012), Random feature maps for dot product kernels. In 15th International Conference on Artificial Intelligence and Statistics (Lawrence, N. D. and Girolami, M., eds), Vol. 22 of Proceedings of Machine Learning Research, PMLR, pp. 583591.Google Scholar
Kasiviswanathan, S. P., Rudelson, M., Smith, A. and Ullman, J. (2010), The price of privately releasing contingency tables and the spectra of random matrices with correlated rows. In 2010 ACM International Symposium on Theory of Computing (STOC ’10), ACM, pp. 775784.Google Scholar
Kong, W. and Valiant, G. (2017), ‘Spectrum estimation from samples’, Ann. Statist. 45, 22182247.Google Scholar
Koroljuk, V. S. and Borovskich, Y. V. (1994), Theory of $U$ -Statistics, Vol. 273 of Mathematics and its Applications, Kluwer Academic.CrossRefGoogle Scholar
Krahmer, F. and Ward, R. (2011), ‘New and improved Johnson–Lindenstrauss embeddings via the restricted isometry property’, SIAM J. Math. Anal. 43, 12691281.Google Scholar
Kressner, D., Steinlechner, M. and Vandereycken, B. (2016), ‘Preconditioned low-rank Riemannian optimization for linear systems with tensor product structure’, SIAM J. Sci. Comput. 38, A2018A2044.CrossRefGoogle Scholar
Kuczyński, J. and Woźniakowski, H. (1992), ‘Estimating the largest eigenvalue by the power and Lanczos algorithms with a random start’, SIAM J. Matrix Anal. Appl. 13, 10941122.CrossRefGoogle Scholar
Kueng, R. (2019), 2-designs minimize variance of trace estimators. Unpublished manuscript.Google Scholar
Kumar, S., Mohri, M. and Talwalkar, A. (2012), ‘Sampling methods for the Nyström method’, J. Mach. Learn. Res. 13, 9811006.Google Scholar
Kurz, S., Rain, O. and Rjasanow, S. (2002), ‘The adaptive cross-approximation technique for the 3D boundary-element method’, IEEE Trans. Magnetics 38, 421424.Google Scholar
Kyng, R. (2017), Approximate Gaussian elimination. ProQuest LLC, Ann Arbor, MI. PhD thesis, Yale University.Google Scholar
Kyng, R. and Sachdeva, S. (2016), Approximate Gaussian elimination for Laplacians: Fast, sparse, and simple. In 57th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2016), IEEE, pp. 573582.CrossRefGoogle Scholar
Kyng, R., Lee, Y. T., Peng, R., Sachdeva, S. and Spielman, D. A. (2016), Sparsified Cholesky and multigrid solvers for connection Laplacians. In 48th Annual ACM SIGACT Symposium on Theory of Computing (STOC ’16), ACM, pp. 842850.Google Scholar
Le, Q., Sarlós, T. and Smola, A. (2013), Fastfood: Computing Hilbert space expansions in loglinear time. In 30th International Conference on Machine Learning (Dasgupta, S. and McAllester, D., eds), Vol. 28 of Proceedings of Machine Learning Research, PMLR, pp. 244252.Google Scholar
Ledoux, M. and Talagrand, M. (1991), Probability in Banach Spaces: Isoperimetry and Processes, Vol. 23 of Ergebnisse der Mathematik und ihrer Grenzgebiete [Results in Mathematics and Related Areas] (3), Springer.CrossRefGoogle Scholar
Leventhal, D. and Lewis, A. S. (2010), ‘Randomized methods for linear constraints: Convergence rates and conditioning’, Math. Oper. Res. 35, 641654.CrossRefGoogle Scholar
Leyk, Z. and Woźniakowski, H. (1998), ‘Estimating a largest eigenvector by Lanczos and polynomial algorithms with a random start’, Numer. Linear Algebra Appl. 5, 147164.3.0.CO;2-2>CrossRefGoogle Scholar
Li, H., Linderman, G. C., Szlam, A., Stanton, K. P., Kluger, Y. and Tygert, M. (2017), ‘Algorithm 971: An implementation of a randomized algorithm for principal component analysis’, ACM Trans. Math. Software 43, Art. 28, 14.CrossRefGoogle Scholar
Li, M., Bi, W., Kwok, J. T. and Lu, B. (2015 a), ‘Large-scale Nyström kernel matrix approximation using randomized SVD’, IEEE Trans. Neural Networks Learning Syst. 26, 152164.Google ScholarPubMed
Li, Y. and Yang, H. (2017), ‘Interpolative butterfly factorization’, SIAM J. Sci. Comput. 39, A503A531.CrossRefGoogle Scholar
Li, Y., Nguyen, H. L. and Woodruff, D. P. (2014 a), On sketching matrix norms and the top singular vector. In 25th Annual ACM–SIAM Symposium on Discrete Algorithms, ACM, pp. 15621581.Google Scholar
Li, Y., Nguyen, H. L. and Woodruff, D. P. (2014 b), Turnstile streaming algorithms might as well be linear sketches. In 2014 ACM Symposium on Theory of Computing (STOC ’14), ACM, pp. 174183.Google Scholar
Li, Y., Yang, H. and Ying, L. (2018), ‘Multidimensional butterfly factorization’, Appl. Comput. Harmon. Anal. 44, 737758.CrossRefGoogle Scholar
Li, Y., Yang, H., Martin, E. R., Ho, K. L. and Ying, L. (2015 b), ‘Butterfly factorization’, Multiscale Model. Simul. 13, 714732.Google Scholar
Liberty, E. (2009), Accelerated dense random projections. PhD thesis, Computer Science, Yale University.Google Scholar
Liberty, E., Woolfe, F., Martinsson, P.-G., Rokhlin, V. and Tygert, M. (2007), ‘Randomized algorithms for the low-rank approximation of matrices’, Proc. Nat. Acad. Sci. USA 104, 2016720172.CrossRefGoogle ScholarPubMed
Lim, L.-H. and Weare, J. (2017), ‘Fast randomized iteration: Diffusion Monte Carlo through the lens of numerical linear algebra’, SIAM Review 59, 547587.CrossRefGoogle Scholar
Lin, L., Lu, J. and Ying, L. (2011), ‘Fast construction of hierarchical matrix representation from matrix–vector multiplication’, J. Comput. Phys. 230, 40714087.Google Scholar
Linial, N., London, E. and Rabinovich, Y. (1995), ‘The geometry of graphs and some of its algorithmic applications’, Combinatorica 15, 215245.CrossRefGoogle Scholar
Lopes, M. E. (2019), ‘Estimating the algorithmic variance of randomized ensembles via the bootstrap’, Ann. Statist. 47, 10881112.Google Scholar
Lopez-Paz, D., Sra, S., Smola, A., Ghahramani, Z. and Schölkopf, B. (2014), Randomized nonlinear component analysis. In 31st International Conference on Machine Learning (Xing, E. P. and Jebara, T., eds), Vol. 32 of Proceedings of Machine Learning Research, PMLR, pp. 13591367.Google Scholar
Lust-Piquard, F. (1986), ‘Inégalités de Khintchine dans ${C}_p\left(1\lt p\lt \infty \right)$ ’, C.R. Acad. Sci. Paris Sér. I Math. 303, 289292.Google Scholar
Mackey, L., Jordan, M. I., Chen, R. Y., Farrell, B. and Tropp, J. A. (2014), ‘Matrix concentration inequalities via the method of exchangeable pairs’, Ann. Probab. 42, 906945.CrossRefGoogle Scholar
Mahoney, M. W. (2011), ‘Randomized algorithms for matrices and data’, Found. Trends Mach. Learn. 3, 123224.Google Scholar
Mahoney, M. W. and Drineas, P. (2009), ‘CUR matrix decompositions for improved data analysis’, Proc. Nat. Acad. Sci. USA 106, 697702.CrossRefGoogle ScholarPubMed
Malik, O. A. and Becker, S. (2019), Guarantees for the Kronecker fast Johnson–Lindenstrauss transform using a coherence and sampling argument. arXiv:1911.08424 Google Scholar
March, W. B., Xiao, B. and Biros, G. (2015), ‘ASKIT: Approximate skeletonization kernel-independent treecode in high dimensions’, SIAM J. Sci. Comput. 37, A1089A1110.CrossRefGoogle Scholar
Marcus, M. B. and Pisier, G. (1981), Random Fourier Series with Applications to Harmonic Analysis, Vol. 101 of Annals of Mathematics Studies, Princeton University Press and University of Tokyo Press.Google Scholar
Martinsson, P.-G. (2008), Rapid factorization of structured matrices via randomized sampling. arXiv:0806.2339 Google Scholar
Martinsson, P.-G. (2011), ‘A fast randomized algorithm for computing a hierarchically semiseparable representation of a matrix’, SIAM J. Matrix Anal. Appl. 32, 12511274.CrossRefGoogle Scholar
Martinsson, P.-G. (2015), Blocked rank-revealing QR factorizations: How randomized sampling can be used to avoid single-vector pivoting. arXiv:1505.08115 Google Scholar
Martinsson, P.-G. (2016), ‘Compressing rank-structured matrices via randomized sampling’, SIAM J. Sci. Comput. 38, A1959A1986.CrossRefGoogle Scholar
Martinsson, P.-G. (2018), Randomized methods for matrix computations. In The Mathematics of Data (Mahoney, M. W., Duchi, J. and Gilbert, A., eds), Vol. 25 of IAS/Park City Mathematics Series, AMS, pp. 187231.CrossRefGoogle Scholar
Martinsson, P.-G. (2019), Fast Direct Solvers for Elliptic PDEs, Vol. CB96 of CBMS-NSF Conference Series, SIAM.CrossRefGoogle Scholar
Martinsson, P.-G. and Rokhlin, V. (2005), ‘A fast direct solver for boundary integral equations in two dimensions’, J. Comput. Phys. 205, 123.CrossRefGoogle Scholar
Martinsson, P.-G. and Tropp, J. (2020), Randomized numerical linear algebra: Foundations and algorithms. arXiv:2002.01387 Google Scholar
Martinsson, P.-G. and Voronin, S. (2016), ‘A randomized blocked algorithm for efficiently computing rank-revealing factorizations of matrices’, SIAM J. Sci. Comput. 38, S485S507.Google Scholar
Martinsson, P.-G., Rokhlin, V. and Tygert, M. (2006 a), A randomized algorithm for the approximation of matrices. Yale CS research report YALEU/DCS/RR-1361, Computer Science Department, Yale University.Google Scholar
Martinsson, P.-G., Rokhlin, V. and Tygert, M. (2006 b), ‘On interpolation and integration in finite-dimensional spaces of bounded functions’, Comm. Appl. Math. Comput. Sci. 1, pp. 133142.CrossRefGoogle Scholar
Martinsson, P.-G., Orti, G. Quintana and Heavner, N. (2019), randUTV: A blocked randomized algorithm for computing a rank-revealing UTV factorization. ACM Trans. Math. Software 45, 4.CrossRefGoogle Scholar
Martinsson, P.-G., Quintana-Ortí, G., Heavner, N. and van de Geijn, R. (2015), Householder QR factorization with randomization for column pivoting (HQRRP). arXiv:1512.02671 Google Scholar
Martinsson, P.-G., Quintana-Ortí, G., Heavner, N. and van de Geijn, R. (2017), ‘Householder QR factorization with randomization for column pivoting (HQRRP)’, SIAM J. Sci. Comput. 39, C96C115.CrossRefGoogle Scholar
Mathias, R. and Stewart, G. W. (1993), ‘A block {QR} algorithm and the singular value decomposition’, Linear Algebra Appl. 182, 91100.CrossRefGoogle Scholar
McCoy, M. B. and Tropp, J. A. (2013), The achievable performance of convex demixing. ACM Technical Report 2017-02, Caltech.Google Scholar
McCoy, M. B. and Tropp, J. A. (2014), ‘From Steiner formulas for cones to concentration of intrinsic volumes’, Discrete Comput. Geom. 51, 926963.Google Scholar
Melgaard, C. and Gu, M. (2015), Gaussian elimination with randomized complete pivoting. arXiv:1511.08528 Google Scholar
Mendelson, S., Rauhut, H. and Ward, R. (2018), ‘Improved bounds for sparse recovery from subsampled random convolutions’, Ann. Appl. Probab. 28, 34913527.CrossRefGoogle Scholar
Meng, X. and Mahoney, M. W. (2013), Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression. In 2013 ACM Symposium on Theory of Computing (STOC ’13), ACM, pp. 91100.Google Scholar
Meng, X., Saunders, M. A. and Mahoney, M. W. (2014), ‘LSRN: A parallel iterative solver for strongly over- or underdetermined systems’, SIAM J. Sci. Comput. 36, C95C118.CrossRefGoogle ScholarPubMed
Mezzadri, F. (2007), ‘How to generate random matrices from the classical compact groups’, Notices Amer. Math. Soc. 54, 592604.Google Scholar
Moulines, E. and Bach, F. R. (2011), Non-asymptotic analysis of stochastic approximation algorithms for machine learning. In Advances in Neural Information Processing Systems 24 (Shawe-Taylor, J. et al., eds), Curran Associates, pp. 451459.Google Scholar
Muirhead, R. J. (1982), Aspects of Multivariate Statistical Theory, Wiley Series in Probability and Mathematical Statistics, Wiley.Google Scholar
Musco, C. and Musco, C. (2015), Randomized block Krylov methods for stronger and faster approximate singular value decomposition. In Advances in Neural Information Processing Systems 28 (Cortes, C. et al., eds), Curran Associates, pp. 13961404.Google Scholar
Musco, C. and Musco, C. (2017), Recursive sampling for the Nyström method. In Advances in Neural Information Processing Systems 30 (Guyon, I. et al., eds), Curran Associates, pp. 38333845.Google Scholar
Musco, C., Musco, C. and Sidford, A. (2018), Stability of the Lanczos method for matrix function approximation. In 29th Annual ACM–SIAM Symposium on Discrete Algorithms, SIAM, pp. 16051624.Google Scholar
Muthukrishnan, S. (2005), ‘Data streams: Algorithms and applications’, Found. Trends Theor. Comput. Sci. 1, 117236.Google Scholar
Neal, R. M. (1996), Priors for Infinite Networks, Springer, pp. 2953.Google Scholar
Needell, D. and Tropp, J. A. (2014), ‘Paved with good intentions: Analysis of a randomized block Kaczmarz method’, Linear Algebra Appl. 441, 199221.CrossRefGoogle Scholar
Needell, D., Ward, R. and Srebro, N. (2014), Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm. In Advances in Neural Information Processing Systems 27 (Ghahramani, Z. et al., eds), Curran Associates, pp. 10171025.Google Scholar
Nelson, J. and Nguyen, H. L. (2013), OSNAP: Faster numerical linear algebra algorithms via sparser subspace embeddings. In 2013 IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS 2013), IEEE, pp. 117126.CrossRefGoogle Scholar
The Numerical Algorithms Group (NAG) (2019), The NAG Library Mark 27. https://www.nag.com/content/naglibrary-mark27 Google Scholar
Nyström, E. J. (1930), ‘Über Die Praktische Auflösung von Integralgleichungen mit Anwendungen auf Randwertaufgaben’, Acta Math. 54, 185204.Google Scholar
Oja, E. (1982), ‘A simplified neuron model as a principal component analyzer’, J. Math. Biol. 15, 267273.CrossRefGoogle ScholarPubMed
Oliveira, R. I. (2009 a), Concentration of the adjacency matrix and of the Laplacian in random graphs with independent edges. arXiv:0911.0600 Google Scholar
Oliveira, R. I. (2009 b), ‘On the convergence to equilibrium of Kac’s random walk on matrices’, Ann. Appl. Probab. 19, 12001231.CrossRefGoogle Scholar
O’Neil, M. (2007), A new class of analysis-based fast transforms. PhD thesis, Mathematics, Yale University.Google Scholar
Oymak, S. and Tropp, J. A. (2018), ‘Universality laws for randomized dimension reduction, with applications’, Inf. Inference 7, 337446.Google Scholar
Pagh, R. (2013), ‘Compressed matrix multiplication’, ACM Trans. Comput. Theory 5, 9.CrossRefGoogle Scholar
Pan, V. Y. and Zhao, L. (2017), ‘Numerically safe Gaussian elimination with no pivoting’, Linear Algebra Appl. 527, 349383.CrossRefGoogle Scholar
Papadimitriou, C. H., Raghavan, P., Tamaki, H. and Vempala, S. (2000), ‘Latent semantic indexing: A probabilistic analysis’, J. Comput. System Sci. 61, 217235.CrossRefGoogle Scholar
Park, H. and Eldén, L. (1995), ‘Downdating the rank-revealing URV decomposition’, SIAM J. Matrix Anal. Appl. 16, 138155.CrossRefGoogle Scholar
Parker, D. S. (1995), Random butterfly transformations with applications in computational linear algebra. Report CSD-950023, UCLA.Google Scholar
Parlett, B. N. (1998), The Symmetric Eigenvalue Problem, corrected reprint of the 1980 original, Vol. 20 of Classics in Applied Mathematics, SIAM.Google Scholar
Pham, N. and Pagh, R. (2013), Fast and scalable polynomial kernels via explicit feature maps. In 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’13), ACM, pp. 239247.CrossRefGoogle Scholar
Pilanci, M. and Wainwright, M. J. (2015), ‘Randomized sketches of convex programs with sharp guarantees’, IEEE Trans. Inform. Theory 61, 50965115.CrossRefGoogle Scholar
Pilanci, M. and Wainwright, M. J. (2016), ‘Iterative Hessian sketch: Fast and accurate solution approximation for constrained least-squares’, J. Mach. Learn. Res. 17, Paper No. 53, 38.Google Scholar
Pillai, N. S. and Smith, A. (2017), ‘Kac’s walk on $n$ -sphere mixes in $n\log n$ steps’, Ann. Appl. Probab. 27, 631650.Google Scholar
Porod, U. (1996), ‘The cut-off phenomenon for random reflections’, Ann. Probab. 24, 7496.Google Scholar
Pourkamali-Anaraki, F. and Becker, S. (2019), ‘Improved fixed-rank Nyström approximation via QR decomposition: Practical and theoretical aspects’, Neurocomput. 363, 261272.CrossRefGoogle Scholar
Rahimi, A. and Recht, B. (2008), Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems 20 (Platt, J. C. et al., eds), Curran Associates, pp. 11771184.Google Scholar
Rahimi, A. and Recht, B. (2009), Weighted sums of random kitchen sinks: Replacing minimization with randomization in learning. In Advances in Neural Information Processing Systems 21 (Koller, D. et al., eds), Curran Associates, pp. 13131320.Google Scholar
Rauhut, H. and Ward, R. (2012), ‘Sparse Legendre expansions via ${\ell}_1$ -minimization’, J. Approx. Theory 164, 517533.CrossRefGoogle Scholar
Rauhut, H. and Ward, R. (2016), ‘Interpolation via weighted ${\ell}_1$ minimization’, Appl. Comput. Harmon. Anal. 40, 321351.CrossRefGoogle Scholar
Rauhut, H., Romberg, J. and Tropp, J. A. (2012), ‘Restricted isometries for partial random circulant matrices’, Appl. Comput. Harmon. Anal. 32, 242254.CrossRefGoogle Scholar
Richtárik, P. and Takáč, M. (2020), ‘Stochastic reformulations of linear systems: Algorithms and convergence theory’, SIAM J. Matrix Anal. Appl. o41, 487524.Google Scholar
Rokhlin, V. and Tygert, M. (2008), ‘A fast randomized algorithm for overdetermined linear least-squares regression’, Proc. Nat. Acad. Sci. USA 105, 1321213217.CrossRefGoogle ScholarPubMed
Rokhlin, V., Szlam, A. and Tygert, M. (2009), ‘A randomized algorithm for principal component analysis’, SIAM J. Matrix Anal. Appl. 31, 11001124.Google Scholar
Rosenthal, J. S. (1994), ‘Random rotations: Characters and random walks on $\mathrm{SO}(N)$ ’, Ann. Probab. 22, 398423.CrossRefGoogle Scholar
Ross, N. (2011), ‘Fundamentals of Stein’s method’, Probab. Surv. 8, 210293.Google Scholar
Rudelson, M. (1999), ‘Random vectors in the isotropic position’, J. Funct. Anal. 164, 6072.CrossRefGoogle Scholar
Rudelson, M. (2012), ‘Row products of random matrices’, Adv. Math. 231, 31993231.CrossRefGoogle Scholar
Rudelson, M. and Vershynin, R. (2007), ‘Sampling from large matrices: An approach through geometric functional analysis’, J. Assoc. Comput. Mach. 54, 21.CrossRefGoogle Scholar
Rudelson, M. and Vershynin, R. (2008), ‘On sparse reconstruction from Fourier and Gaussian measurements’, Comm. Pure Appl. Math. 61, 10251045.CrossRefGoogle Scholar
Rudi, A. and Rosasco, L. (2017), Generalization properties of learning with random features. In Advances in Neural Information Processing Systems 30 (Guyon, I. et al., eds), Curran Associates, pp. 32153225.Google Scholar
Rudi, A., Calandriello, D., Carratino, L. and Rosasco, L. (2018), On fast leverage score sampling and optimal learning. In Advances in Neural Information Processing Systems 31 (Bengio, S. et al., eds), Curran Associates, pp. 56725682.Google Scholar
Rudi, A., Camoriano, R. and Rosasco, L. (2015), Less is more: Nyström computational regularization. In Advances in Neural Information Processing Systems 28 (Cortes, C. et al., eds), Curran Associates, pp. 16571665.Google Scholar
Rudi, A., Carratino, L. and Rosasco, L. (2017), FALKON: An optimal large scale kernel method. In Advances in Neural Information Processing Systems 30 (Guyon, I. et al., eds), Curran Associates, pp. 38883898.Google Scholar
Samo, Y.-L. K. and Roberts, S. (2015), Generalized spectral kernels. arXiv:1506.02236 Google Scholar
Sankar, A., Spielman, D. A. and Teng, S.-H. (2006), ‘Smoothed analysis of the condition numbers and growth factors of matrices’, SIAM J. Matrix Anal. Appl. 28, 446476.CrossRefGoogle Scholar
Sarlós, T. (2006), Improved approximation algorithms for large matrices via random projections. In 2006 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS ’06), pp. 143152.CrossRefGoogle Scholar
Schoenberg, I. J. (1942), ‘Positive definite functions on spheres’, Duke Math. J. 9, 96108.CrossRefGoogle Scholar
Schölkopf, B. and Smola, A. (2001), Learning with Kernels, MIT Press.Google Scholar
Schölkopf, B., Smola, A. and Müller, K.-R. (1996), Nonlinear component analysis as a kernel eigenvalue problem. Technical report 44, Max-Planck-Institut für biologische Kybernetik.Google Scholar
Simchowitz, M., Alaoui, A. E. and Recht, B. (2017), On the gap between strict-saddles and true convexity: An Omega $(\log d)$ lower bound for eigenvector approximation. arXiv:1704.04548 Google Scholar
Sloane, N. J. A. (1983), Encrypting by random rotations. In Cryptography (Burg Feuerstein, 1982), Vol. 149 of Lecture Notes in Computer Science, Springer, pp. 71128.Google Scholar
Sorensen, D. C. and Embree, M. (2016), ‘A DEIM induced CUR factorization’, SIAM J. Sci. Comput. 38, A1454A1482.CrossRefGoogle Scholar
Spielman, D. A. and Srivastava, N. (2011), ‘Graph sparsification by effective resistances’, SIAM J. Comput. 40, 19131926.CrossRefGoogle Scholar
Sriperumbudur, B. and Szabó, Z. (2015), Optimal rates for random Fourier features. In Advances in Neural Information Processing Systems 28 (Cortes, C. et al., eds), Curran Associates, pp. 11441152.Google Scholar
Stewart, G. W. (1994), UTV decompositions. In Numerical Analysis 1993, Vol. 303 of Pitman Research Notes in Mathematics Series, Longman Scientific and Technical, pp. 225225.Google Scholar
Stewart, G. W. (1998), Matrix Algorithms , Vol. 1: Basic Decompositions, SIAM.Google Scholar
Stewart, G. W. (1999), ‘The QLP approximation to the singular value decomposition’, SIAM J. Sci. Comput. 20, 13361348.Google Scholar
Stewart, G. W. (2001), Matrix Algorithms , Vol. 2: Eigensystems, SIAM.Google Scholar
Stojnic, M. (2010), ${\ell}_1$ optimization and its various thresholds in compressed sensing. In 2010 IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 39103913.CrossRefGoogle Scholar
Strang, G. (2019), Linear Algebra and Learning from Data, Wellesley-Cambridge Press.Google Scholar
Strassen, V. (1969), ‘Gaussian elimination is not optimal’, Numer. Math. 13, 354356.CrossRefGoogle Scholar
Strohmer, T. and Vershynin, R. (2009), ‘A randomized Kaczmarz algorithm with exponential convergence’, J. Fourier Anal. Appl. 15, 262278.CrossRefGoogle Scholar
Sun, Y., Guo, Y., Tropp, J. A. and Udell, M. (2018), Tensor random projection for low memory dimension reduction. In 32nd Conference on Neural Information Processing Systems, Montréal, Canada .Google Scholar
Szabó, Z. and Sriperumbudur, B. (2019), On kernel derivative approximation with random Fourier features. In 22nd International Conference on Artificial Intelligence and Statistics (Chaudhuri, K. and Sugiyama, M., eds), Vol. 89 of Proceedings of Machine Learning Research, PMLR, pp. 827836.Google Scholar
Teng, S.-H. (2010), The Laplacian paradigm: Emerging algorithms for massive graphs. In Theory and Applications of Models of Computation, Vol. 6108 of Lecture Notes in Computer Science, Springer, pp. 214.CrossRefGoogle Scholar
Thrampoulidis, C. and Hassibi, B. (2015), Isotropically random orthogonal matrices: Performance of LASSO and minimum conic singular values. In 2015 IEEE International Symposium on Information Theory (ISIT), pp. 556560.CrossRefGoogle Scholar
Thrampoulidis, C., Oymak, S. and Hassibi, B. (2014), The Gaussian min-max theorem in the presence of convexity. arXiv:1408.4837 Google Scholar
Thurau, C., Kersting, K. and Bauckhage, C. (2012), Deterministic CUR for improved large-scale data analysis: An empirical study. In 2012 SIAM International Conference on Data Mining, SIAM, pp. 684695.Google Scholar
Tomczak-Jaegermann, N. (1974), ‘The moduli of smoothness and convexity and the Rademacher averages of trace classes ${S}_p\left(1\le p<\infty \right)$ ’, Studia Math. 50, 163182.Google Scholar
Ton, J.-F., Flaxman, S., Sejdinovic, D. and Bhatt, S. (2018), ‘Spatial mapping with Gaussian processes and nonstationary Fourier features’, Spat. Statist. 28, 5978.CrossRefGoogle ScholarPubMed
Trefethen, L. N. and Bau III, D. (1997), Numerical Linear Algebra, SIAM.CrossRefGoogle Scholar
Trogdon, T. (2017), ‘On spectral and numerical properties of random butterfly matrices’, Appl. Math. Lett. 95, 4858.CrossRefGoogle Scholar
Tropp, J. A. (2011 a), ‘Freedman’s inequality for matrix martingales’, Electron. Commun. Probab. 16, 262270.CrossRefGoogle Scholar
Tropp, J. A. (2011 b), ‘Improved analysis of the subsampled randomized Hadamard transform’, Adv. Adapt. Data Anal. 3, 115126.CrossRefGoogle Scholar
Tropp, J. A. (2012 a), ‘A comparison principle for functions of a uniformly random subspace’, Probab. Theory Related Fields 153, 759769.CrossRefGoogle Scholar
Tropp, J. A. (2012 b), ‘User-friendly tail bounds for sums of random matrices’, Found. Comput. Math. 12, 389434.CrossRefGoogle Scholar
Tropp, J. A. (2015), ‘An introduction to matrix concentration inequalities’, Found. Trends Mach. Learn. 8, 1230.CrossRefGoogle Scholar
Tropp, J. A. (2016), The expected norm of a sum of independent random matrices: An elementary approach. In High Dimensional Probability VII, Vol. 71 of Progress in Probability, Springer, pp. 173202.CrossRefGoogle Scholar
Tropp, J. A. (2018), Analysis of randomized block Krylov methods. Under revision.Google Scholar
Tropp, J. A. (2019), Matrix concentration and computational linear algebra. CMS Lecture Notes 2019-01, Caltech, Pasadena, CA.Google Scholar
Tropp, J. A., Wakin, M. B., Duarte, M. F., Baron, D. and Baraniuk, R. G. (2006), Random filters for compressive sampling and reconstruction. In 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing Proceedings, Vol. 3.CrossRefGoogle Scholar
Tropp, J. A., Yurtsever, A., Udell, M. and Cevher, V. (2017 a), Fixed-rank approximation of a positive-semidefinite matrix from streaming data. In Advances in Neural Information Processing Systems 30 (Guyon, I. et al., eds), Curran Associates, pp. 12251234.Google Scholar
Tropp, J. A., Yurtsever, A., Udell, M. and Cevher, V. (2017 b), ‘Practical sketching algorithms for low-rank matrix approximation’, SIAM J. Matrix Anal. Appl. 38, 14541485.CrossRefGoogle Scholar
Tropp, J. A., Yurtsever, A., Udell, M. and Cevher, V. (2019), ‘Streaming low-rank matrix approximation with an application to scientific simulation’, SIAM J. Sci. Comput. 41, A2430A2463.CrossRefGoogle Scholar
Ubaru, S., Chen, J. and Saad, Y. (2017), ‘Fast estimation of $\mathsf{tr}\left(f(A)\right)$ via stochastic Lanczos quadrature’, SIAM J. Matrix Anal. Appl. 38, 10751099.CrossRefGoogle Scholar
Ullah, E., Mianjy, P., Marinov, T. V. and Arora, R. (2018), Streaming kernel PCA with $\widetilde{O}\left(\sqrt{n}\right)$ random features. In Advances in Neural Information Processing Systems 31 (Bengio, S. et al., eds), Curran Associates, pp. 73117321.Google Scholar
Upadhyay, J. (2016), Fast and space-optimal low-rank factorization in the streaming model with application in differential privacy. arXiv:1604.01429 Google Scholar
Urano, Y. (2013), A fast randomized algorithm for linear least-squares regression via sparse transforms. Master’s thesis, New York University.Google Scholar
Van Handel, R. (2016), Probability in high dimension. APC 550 Lecture Notes, Princeton University.Google Scholar
Vershynin, R. (2018), High-Dimensional Probability: An Introduction with Applications in Data Science, Vol. 47 of Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press.CrossRefGoogle Scholar
Vershynin, R. (2019), Concentration inequalities for random tensors. arXiv:1905.00802 Google Scholar
Voronin, S. and Martinsson, P.-G. (2017), ‘Efficient algorithms for CUR and interpolative matrix decompositions’, Adv. Comput. Math. 43, 495516.CrossRefGoogle Scholar
Waldron, S. F. D. (2018), An Introduction to Finite Tight Frames, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer.CrossRefGoogle Scholar
Wang, S. (2019), Simple and almost assumption-free out-of-sample bound for random feature mapping. arXiv:1909.11207 Google Scholar
Wang, S., Gittens, A. and Mahoney, M. W. (2019), ‘Scalable kernel $k$ -means clustering with Nyström approximation: Relative-error bounds’, J. Mach. Learn. Res. 20, 431479.Google Scholar
Whaley, R. C. and Dongarra, J. J. (1998), Automatically tuned linear algebra software. In 1998 ACM/IEEE Conference on Supercomputing (SC ’98), IEEE, pp. 127.Google Scholar
Williams, C. K. I. and Seeger, M. (2001), Using the Nyström method to speed up kernel machines. In Advances in Neural Information Processing Systems 13 (Leen, T. K., Dietterich, T. G. and Tresp, V., eds), MIT Press, pp. 682688.Google Scholar
Woodruff, D. P. (2014), ‘Sketching as a tool for numerical linear algebra’, Found. Trends Theor. Comput. Sci. 10, 1157.Google Scholar
Woolfe, F., Liberty, E., Rokhlin, V. and Tygert, M. (2008), ‘A fast randomized algorithm for the approximation of matrices’, Appl. Comput. Harmon. Anal. 25, 335366.CrossRefGoogle Scholar
Wootters, W. K. and Fields, B. D. (1989), ‘Optimal state-determination by mutually unbiased measurements’, Ann. Phys. 191, 363381.CrossRefGoogle Scholar
Xia, J. (2013), ‘Randomized sparse direct solvers’, SIAM J. Matrix Anal. Appl. 34, 197227.CrossRefGoogle Scholar
Xiao, J., Gu, M. and Langou, J. (2017), Fast parallel randomized QR with column pivoting algorithms for reliable low-rank matrix approximations. In 2017 IEEE 24th International Conference on High Performance Computing (HiPC), IEEE, pp. 233242.Google Scholar
Yu, C. D., Levitt, J., Reiz, S. and Biros, G. (2017 a), Geometry-oblivious FMM for compressing dense SPD matrices. In International Conference for High Performance Computing, Networking, Storage and Analysis (SC ’17), ACM, art. 53.Google Scholar
Yu, W., Gu, Y., Li, J., Liu, S. and Li, Y. (2017 b), Single-pass PCA of large high-dimensional data. In Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence (IJCAI-17), pp. 33503356.Google Scholar
Yuan, Q., Gu, M. and Li, B. (2018), Superlinear convergence of randomized block Lanczos algorithm. In 2018 IEEE International Conference on Data Mining (ICDM), pp. 14041409.CrossRefGoogle Scholar
Zhang, F., ed. (2005), The Schur Complement and its Applications , Vol. 4 of Numerical Methods and Algorithms, Springer.CrossRefGoogle Scholar
Zouzias, A. (2013), Randomized primitives for linear algebra and applications. PhD thesis, University of Toronto.Google Scholar