Learning Complexity vs Communication Complexity

Published online by Cambridge University Press: 01 March 2009

NATI LINIAL
Affiliation:
School of Computer Science and Engineering, Hebrew University, Jerusalem, Israel
ADI SHRAIBMAN
Affiliation:
Department of Mathematics, Weizmann Institute of Science, Rehovot, Israel

Abstract

This paper has two main focal points. We first consider an important class of machine learning algorithms: large margin classifiers, such as Support Vector Machines. The notion of margin complexity quantifies the extent to which a given class of functions can be learned by large margin classifiers. We prove that up to a small multiplicative constant, margin complexity is equal to the inverse of discrepancy. This establishes a strong tie between seemingly very different notions from two distinct areas.
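For orientation, the two quantities compared above can be written out as follows. This is a sketch using the standard definitions for an $m \times n$ sign matrix $A = (a_{ij})$; the symbols $m(A)$, $mc(A)$, $\operatorname{disc}(A)$ and the constant $c$ are notation assumed here, and the paper's own normalisation and the value of its constant are not given in the abstract.

```latex
% Assumed notation: A is a sign matrix with entries a_{ij} in {-1,+1}.
% Margin: the best worst-case margin over realisations of A by unit vectors
% x_i (rows) and y_j (columns) whose inner products have the signs of A.
\[
  m(A) \;=\; \sup_{\substack{\|x_i\| = \|y_j\| = 1 \\ \operatorname{sign}\langle x_i, y_j\rangle = a_{ij}}}
  \;\min_{i,j}\, \bigl|\langle x_i, y_j\rangle\bigr|,
  \qquad
  mc(A) \;=\; \frac{1}{m(A)} .
\]
% Discrepancy: P ranges over probability distributions on the entries of A,
% and S x T ranges over combinatorial rectangles (subsets of rows and columns).
\[
  \operatorname{disc}(A) \;=\; \min_{P}\; \max_{S,\,T}\;
  \Bigl|\, \sum_{i \in S,\; j \in T} p_{ij}\, a_{ij} \Bigr| .
\]
% The stated result, with c >= 1 an absolute constant (value not given here):
\[
  \frac{1}{c}\; mc(A) \;\le\; \frac{1}{\operatorname{disc}(A)} \;\le\; c \; mc(A).
\]
```

Read this way, a matrix with small discrepancy (hard for communication protocols) is exactly one with large margin complexity (hard for large margin classifiers), up to the constant factor.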

In the same way that matrix rigidity is related to rank, we introduce the notion of rigidity of margin complexity. We prove that sign matrices with small margin complexity rigidity are very rare. This leads to the question of proving lower bounds on the rigidity of margin complexity. Quite surprisingly, this question turns out to be closely related to basic open problems in communication complexity, e.g., whether PSPACE can be separated from the polynomial hierarchy in communication complexity.
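The analogy with matrix rigidity can be made concrete as follows. This is a sketch under assumed notation: $R_A(r)$ is the usual rank rigidity, and $R^{mc}_A(\mu)$ is a hypothetical symbol for the margin complexity analogue; the paper's exact definition may differ in details.

```latex
% Rank rigidity: the fewest entries of A that must be changed
% to bring the rank down to at most r.
\[
  R_A(r) \;=\; \min \bigl\{\, |\{(i,j) : a_{ij} \neq b_{ij}\}| \;:\; \operatorname{rank}(B) \le r \,\bigr\}.
\]
% Assumed analogue for margin complexity: B ranges over sign matrices of the
% same dimensions as A, and mu is a threshold on the margin complexity mc(B).
\[
  R^{mc}_A(\mu) \;=\; \min \bigl\{\, |\{(i,j) : a_{ij} \neq b_{ij}\}| \;:\; B \in \{-1,+1\}^{m \times n},\; mc(B) \le \mu \,\bigr\}.
\]
```

In these terms, a lower bound on rigidity says that $A$ stays complex even after many of its entries are changed.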

Communication is a key ingredient in many types of learning. This explains the relations between the field of learning theory and that of communication complexity [6, 10, 16, 26]. The results of this paper constitute another link in this rich web of relations. These new results have already been applied toward the solution of several open problems in communication complexity [18, 20, 29].

Type
Paper
Copyright
Copyright © Cambridge University Press 2009

References

[1] Aho, A. V., Ullman, J. D. and Yannakakis, M. (1983) On notions of information transfer in VLSI circuits. In Proc. 15th ACM STOC, pp. 133–139.
[2] Alon, N. (1995) Tools from higher algebra. In Handbook of Combinatorics, Vol. 1, North-Holland, pp. 1749–1783.
[3] Alon, N., Frankl, P. and Rödl, V. (1985) Geometrical realizations of set systems and probabilistic communication complexity. In Proc. 26th Symposium on Foundations of Computer Science, IEEE Computer Society Press, pp. 277–280.
[4] Alon, N. and Naor, A. (2004) Approximating the cut-norm via Grothendieck's inequality. In Proc. 36th ACM STOC, pp. 72–80.
[5] Babai, L., Frankl, P. and Simon, J. (1986) Complexity classes in communication complexity. In Proc. 27th IEEE Symposium on Foundations of Computer Science, pp. 337–347.
[6] Ben-David, S., Eiron, N. and Simon, H. U. (2002) Limitations of learning via embeddings in Euclidean half spaces. J. Machine Learning Research 3 441–461.
[7] Chazelle, B. (2000) The Discrepancy Method: Randomness and Complexity, Cambridge University Press.
[8] Cristianini, N. and Shawe-Taylor, J. (1999) An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, New York.
[9] Forster, J. (2001) A linear lower bound on the unbounded error probabilistic communication complexity. In SCT: Annual Conference on Structure in Complexity Theory, IEEE Computer Society Press, pp. 100–106.
[10] Forster, J., Krause, M., Lokam, S. V., Mubarakzjanov, R., Schmitt, N. and Simon, H. U. (2001) Relations between communication complexity, linear arrangements, and computational complexity. In Proc. 21st Conference on Foundations of Software Technology and Theoretical Computer Science, pp. 171–182.
[11] Forster, J., Schmitt, N. and Simon, H. U. (2001) Estimating the optimal margins of embeddings in Euclidean half spaces. In Proc. 14th Annual Conference on Computational Learning Theory (COLT 2001) and 5th European Conference on Computational Learning Theory (EuroCOLT 2001), Vol. 2111 of Lecture Notes in Computer Science, Springer, Berlin, pp. 402–415.
[12] Forster, J. and Simon, H. U. (2006) On the smallest possible dimension and the largest possible margin of linear arrangements representing given concept classes. Theor. Comput. Sci. 350 40–48.
[13] Jameson, G. J. O. (1987) Summing and Nuclear Norms in Banach Space Theory, London Mathematical Society Student Texts, Cambridge University Press.
[14] Johnson, W. B. and Lindenstrauss, J. (1984) Extensions of Lipschitz mappings into a Hilbert space. In Conference in Modern Analysis and Probability, AMS, Providence, RI, pp. 189–206.
[15] Kashin, B. and Razborov, A. (1998) Improved lower bounds on the rigidity of Hadamard matrices. Mathematical Notes 63 471–475.
[16] Kremer, I., Nisan, N. and Ron, D. (1995) On randomized one-round communication complexity. In Proc. 35th IEEE FOCS, pp. 596–605.
[17] Kushilevitz, E. and Nisan, N. (1997) Communication Complexity, Cambridge University Press.
[18] Lee, T., Shraibman, A. and Špalek, R. (2008) A direct product theorem for discrepancy. In Annual IEEE Conference on Computational Complexity, pp. 71–80.
[19] Linial, N., Mendelson, S., Schechtman, G. and Shraibman, A. (2007) Complexity measures of sign matrices. Combinatorica 27 439–463.
[20] Linial, N. and Shraibman, A. (2007) Lower bounds in communication complexity based on factorization norms. In Proc. 39th ACM STOC, pp. 699–708.
[21] Lokam, S. V. (1995) Spectral methods for matrix rigidity with applications to size-depth tradeoffs and communication complexity. In IEEE Symposium on Foundations of Computer Science, pp. 6–15.
[22] Matoušek, J. (1999) Geometric Discrepancy: An Illustrated Guide, Vol. 18 of Algorithms and Combinatorics, Springer.
[23] Matoušek, J. (2002) Lectures on Discrete Geometry, Vol. 212 of Graduate Texts in Mathematics, Springer.
[24] Mendelson, S. (2005) Embeddings with a Lipschitz function. Random Struct. Alg. 27 25–45.
[25] Mendelson, S. (2005) On the limitations of embedding methods. In Proc. 18th Annual Conference on Learning Theory (COLT05), Vol. 3559 of Lecture Notes in Computer Science, Springer, pp. 353–365.
[26] Paturi, R. and Simon, J. (1986) Probabilistic communication complexity. J. Comput. Syst. Sci. 33 106–123.
[27] Pisier, G. (1986) Factorization of Linear Operators and Geometry of Banach Spaces, Vol. 60 of CBMS Regional Conference Series in Mathematics, Published for the Conference Board of the Mathematical Sciences, Washington, DC.
[28] Pudlák, P. and Rödl, V. (1994) Some combinatorial–algebraic problems from complexity theory. Discrete Math. 136 253–279.
[29] Sherstov, A. A. (2008) Communication complexity under product and nonproduct distributions. In Annual IEEE Conference on Computational Complexity, pp. 64–70.
[30] Shokrollahi, M. A., Spielman, D. A. and Stemann, V. (1997) A remark on matrix rigidity. Inform. Process. Lett. 64 283–285.
[31] Tarui, J. (1993) Randomized polynomials, threshold circuits and polynomial hierarchy. Theoret. Comput. Sci. 113 167–183.
[32] Valiant, L. G. (1977) Graph-theoretic arguments in low level complexity. In Proc. 6th MFCS, Vol. 53 of Lecture Notes in Computer Science, Springer, pp. 162–176.
[33] Vapnik, V. N. (1999) The Nature of Statistical Learning Theory, Springer, New York.
[34] Yao, A. (1983) Lower bounds by probabilistic arguments. In Proc. 15th ACM STOC, pp. 420–428.