A continuum limit for the PageRank algorithm

A. YUAN; J. CALDER; B. OSTING

doi:10.1017/S0956792521000097

Part of: Representations of solutions Elliptic equations and systems

Published online by Cambridge University Press: 27 April 2021

A. YUAN: Department of Mathematics, University of Minnesota, Minneapolis, MN 55455, USA
J. CALDER: Department of Mathematics, University of Minnesota, Minneapolis, MN 55455, USA
B. OSTING: Department of Mathematics, University of Utah, Salt Lake City, UT 84112, USA

Abstract

Semi-supervised and unsupervised machine learning methods often rely on graphs to model data, prompting research on how theoretical properties of operators on graphs are leveraged in learning problems. While most of the existing literature focuses on undirected graphs, directed graphs are very important in practice, giving models for physical, biological or transportation networks, among many other applications. In this paper, we propose a new framework for rigorously studying continuum limits of learning algorithms on directed graphs. We use the new framework to study the PageRank algorithm and show how it can be interpreted as a numerical scheme on a directed graph involving a type of normalised graph Laplacian. We show that the corresponding continuum limit problem, which is taken as the number of webpages grows to infinity, is a second-order, possibly degenerate, elliptic equation that contains reaction, diffusion and advection terms. We prove that the numerical scheme is consistent and stable and compute explicit rates of convergence of the discrete solution to the solution of the continuum limit partial differential equation. We give applications to proving stability and asymptotic regularity of the PageRank vector. Finally, we illustrate our results with numerical experiments and explore an application to data depth.
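For orientation, the discrete object studied in the abstract can be sketched in a few lines. The block below is a minimal illustration of the standard PageRank power iteration on a small directed graph, not the authors' numerical scheme or their graph construction. The function name `pagerank`, the teleportation parameter `alpha = 0.85`, the uniform teleportation vector and the uniform treatment of pages without out-links are all assumptions chosen for illustration.

```python
import numpy as np

def pagerank(adjacency, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for the PageRank vector of a directed graph.

    adjacency[i][j] > 0 means there is a link from page i to page j.
    alpha is the (assumed) probability of following a link rather than
    teleporting to a uniformly random page.
    """
    A = np.asarray(adjacency, dtype=float)
    n = A.shape[0]
    out_degree = A.sum(axis=1)

    # Row-normalise to get random-walk transition probabilities.
    safe_degree = np.where(out_degree > 0, out_degree, 1.0)
    P = A / safe_degree[:, None]
    # Assumption: a page with no out-links jumps uniformly at random.
    P[out_degree == 0] = 1.0 / n

    v = np.full(n, 1.0 / n)  # uniform teleportation distribution (assumed)
    r = v.copy()
    for _ in range(max_iter):
        # One step of the fixed-point iteration r = alpha * P^T r + (1 - alpha) * v.
        r_next = alpha * (P.T @ r) + (1.0 - alpha) * v
        if np.abs(r_next - r).sum() < tol:
            return r_next
        r = r_next
    return r

if __name__ == "__main__":
    # Hypothetical 3-page web: page 2 is linked to by both other pages.
    links = [[0, 1, 1],
             [0, 0, 1],
             [1, 0, 0]]
    print(pagerank(links))
```

The fixed-point equation in the loop has the form of a linear system in the rank vector, which is the sense in which PageRank can be read as a numerical scheme involving a normalised graph operator; the paper's precise formulation and its continuum limit are given in the full text.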

Keywords

Partial differential equations on graphs and networks; second-order elliptic equations; viscosity solutions

MSC classification

Primary: 35J15: Second-order elliptic equations

Secondary: 35D40: Viscosity solutions

Type: Papers
Information: European Journal of Applied Mathematics, Volume 33, Issue 3, June 2022, pp. 472–504

DOI: https://doi.org/10.1017/S0956792521000097
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: © The Author(s), 2021. Published by Cambridge University Press

