Learning Complexity vs Communication Complexity

NATI LINIAL; ADI SHRAIBMAN

doi:10.1017/S0963548308009656


Published online by Cambridge University Press: 01 March 2009


NATI LINIAL: School of Computer Science and Engineering, Hebrew University, Jerusalem, Israel
ADI SHRAIBMAN: Department of Mathematics, Weizmann Institute of Science, Rehovot, Israel


Abstract

This paper has two main focal points. We first consider an important class of machine learning algorithms: large margin classifiers, such as Support Vector Machines. The notion of margin complexity quantifies the extent to which a given class of functions can be learned by large margin classifiers. We prove that up to a small multiplicative constant, margin complexity is equal to the inverse of discrepancy. This establishes a strong tie between seemingly very different notions from two distinct areas.
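The equivalence stated above can be written out as follows. This is a sketch under assumed standard definitions, not the paper's own statement: the abstract does not define these quantities, so the notation $\mathrm{margin}(A)$, $mc(A)$, $\mathrm{disc}(A)$, the matrix dimensions $M \times N$, and the unspecified absolute constant $c \ge 1$ are all illustrative choices.

```latex
% Assumed standard definitions for an M x N sign matrix A (entries +1 or -1).
% Margin: the best worst-case margin over realizations of A by unit vectors.
\[
  \mathrm{margin}(A) \;=\;
  \sup_{\substack{x_1,\dots,x_M,\ y_1,\dots,y_N \\ \|x_i\| = \|y_j\| = 1}}
  \ \min_{i,j}\ A_{ij}\,\langle x_i, y_j \rangle ,
  \qquad
  mc(A) \;=\; \frac{1}{\mathrm{margin}(A)} .
\]
% Discrepancy: P ranges over probability distributions on the entries of A,
% and S x T ranges over combinatorial rectangles.
\[
  \mathrm{disc}(A) \;=\;
  \min_{P}\ \max_{S \subseteq [M],\ T \subseteq [N]}
  \Bigl|\, \sum_{i \in S,\ j \in T} P_{ij}\, A_{ij} \Bigr| .
\]
% The claimed relation, with c an unspecified absolute constant:
\[
  \frac{1}{c}\,\mathrm{margin}(A) \;\le\; \mathrm{disc}(A) \;\le\; c\,\mathrm{margin}(A),
  \qquad\text{equivalently}\qquad
  mc(A) \;=\; \Theta\!\Bigl(\frac{1}{\mathrm{disc}(A)}\Bigr).
\]
```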

In the same way that matrix rigidity is related to rank, we introduce the notion of rigidity of margin complexity. We prove that sign matrices with small margin complexity rigidity are very rare. This leads to the question of proving lower bounds on the rigidity of margin complexity. Quite surprisingly, this question turns out to be closely related to basic open problems in communication complexity, e.g., whether PSPACE can be separated from the polynomial hierarchy in communication complexity.
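The analogy with matrix rigidity can be sketched as follows. The first line is the usual definition of rigidity with respect to rank; the second is one plausible way to formalize its margin complexity counterpart and is an assumption here, since the abstract gives no definition. The symbols $R_A(r)$, $R^{mc}_A(\mu)$, and the dimensions $M \times N$ are illustrative.

```latex
% Rigidity with respect to rank: the fewest entries of A that must be
% changed to bring the rank down to at most r.
\[
  R_A(r) \;=\; \min \bigl\{\, |\{(i,j) : A_{ij} \neq B_{ij}\}| \;:\;
  \operatorname{rank}(B) \le r \,\bigr\}.
\]
% Assumed analogue for margin complexity (mc): the fewest sign flips needed
% to reach a sign matrix B whose margin complexity is at most mu.
\[
  R^{mc}_A(\mu) \;=\; \min \bigl\{\, |\{(i,j) : A_{ij} \neq B_{ij}\}| \;:\;
  B \in \{-1,+1\}^{M \times N},\ mc(B) \le \mu \,\bigr\}.
\]
```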

Communication is a key ingredient in many types of learning. This explains the relations between the field of learning theory and that of communication complexity [6, 10, 16, 26]. The results of this paper constitute another link in this rich web of relations. These new results have already been applied toward the solution of several open problems in communication complexity [18, 20, 29].

Type: Paper
Information: Combinatorics, Probability and Computing, Volume 18, Issue 1-2: Papers from the 2006 Oberwolfach Meeting on Combinatorics, Probability and Computing, March 2009, pp. 227–245

DOI: https://doi.org/10.1017/S0963548308009656
Copyright: Copyright © Cambridge University Press 2009


References

[1] Aho, A. V., Ullman, J. D. and Yannakakis, M. (1983) On notions of information transfer in VLSI circuits. In Proc. 15th ACM STOC, pp. 133–139.

[2] Alon, N. (1995) Tools from higher algebra. In Handbook of Combinatorics, Vol. 1, North-Holland, pp. 1749–1783.

[3] Alon, N., Frankl, P. and Rödl, V. (1985) Geometrical realizations of set systems and probabilistic communication complexity. In Proc. 26th Symposium on Foundations of Computer Science, IEEE Computer Society Press, pp. 277–280.

[4] Alon, N. and Naor, A. (2004) Approximating the cut-norm via Grothendieck's inequality. In Proc. 36th ACM STOC, pp. 72–80.

[5] Babai, L., Frankl, P. and Simon, J. (1986) Complexity classes in communication complexity. In Proc. 27th IEEE Symposium on Foundations of Computer Science, pp. 337–347.

[6] Ben-David, S., Eiron, N. and Simon, H. U. (2002) Limitations of learning via embeddings in Euclidean half spaces. J. Machine Learning Research 3 441–461.

[7] Chazelle, B. (2000) The Discrepancy Method: Randomness and Complexity, Cambridge University Press.

[8] Cristianini, N. and Shawe-Taylor, J. (1999) An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, Cambridge University Press, New York.

[9] Forster, J. (2001) A linear lower bound on the unbounded error probabilistic communication complexity. In SCT: Annual Conference on Structure in Complexity Theory, IEEE Computer Society Press, pp. 100–106.

[10] Forster, J., Krause, M., Lokam, S. V., Mubarakzjanov, R., Schmitt, N. and Simon, H. U. (2001) Relations between communication complexity, linear arrangements, and computational complexity. In Proc. 21st Conference on Foundations of Software Technology and Theoretical Computer Science, pp. 171–182.

[11] Forster, J., Schmitt, N. and Simon, H. U. (2001) Estimating the optimal margins of embeddings in Euclidean half spaces. In Proc. 14th Annual Conference on Computational Learning Theory (COLT 2001) and 5th European Conference on Computational Learning Theory (EuroCOLT 2001), Vol. 2111 of Lecture Notes in Computer Science, Springer, Berlin, pp. 402–415.

[12] Forster, J. and Simon, H. U. (2006) On the smallest possible dimension and the largest possible margin of linear arrangements representing given concept classes. Theor. Comput. Sci. 350 40–48.

[13] Jameson, G. J. O. (1987) Summing and Nuclear Norms in Banach Space Theory, London Mathematical Society Student Texts, Cambridge University Press.

[14] Johnson, W. B. and Lindenstrauss, J. (1984) Extensions of Lipschitz mappings into a Hilbert space. In Conference in Modern Analysis and Probability, AMS, Providence, RI, pp. 189–206.

[15] Kashin, B. and Razborov, A. (1998) Improved lower bounds on the rigidity of Hadamard matrices. Mathematical Notes 63 471–475.

[16] Kremer, I., Nisan, N. and Ron, D. (1995) On randomized one-round communication complexity. In Proc. 35th IEEE FOCS, pp. 596–605.

[17] Kushilevitz, E. and Nisan, N. (1997) Communication Complexity, Cambridge University Press.

[18] Lee, T., Shraibman, A. and Špalek, R. (2008) A direct product theorem for discrepancy. In Annual IEEE Conference on Computational Complexity, pp. 71–80.

[19] Linial, N., Mendelson, S., Schechtman, G. and Shraibman, A. (2007) Complexity measures of sign matrices. Combinatorica 27 439–463.

[20] Linial, N. and Shraibman, A. (2007) Lower bounds in communication complexity based on factorization norms. In Proc. 39th ACM STOC, pp. 699–708.

[21] Lokam, S. V. (1995) Spectral methods for matrix rigidity with applications to size-depth tradeoffs and communication complexity. In IEEE Symposium on Foundations of Computer Science, pp. 6–15.

[22] Matoušek, J. (1999) Geometric Discrepancy: An Illustrated Guide, Vol. 18 of Algorithms and Combinatorics, Springer.

[23] Matoušek, J. (2002) Lectures on Discrete Geometry, Vol. 212 of Graduate Texts in Mathematics, Springer.

[24] Mendelson, S. (2005) Embeddings with a Lipschitz function. Random Struct. Alg. 27 25–45.

[25] Mendelson, S. (2005) On the limitations of embedding methods. In Proc. 18th Annual Conference on Learning Theory (COLT05), Vol. 3559 of Lecture Notes in Computer Science, Springer, pp. 353–365.

[26] Paturi, R. and Simon, J. (1986) Probabilistic communication complexity. J. Comput. Syst. Sci. 33 106–123.

[27] Pisier, G. (1986) Factorization of Linear Operators and Geometry of Banach Spaces, Vol. 60 of CBMS Regional Conference Series in Mathematics, Published for the Conference Board of the Mathematical Sciences, Washington, DC.

[28] Pudlák, P. and Rödl, V. (1994) Some combinatorial–algebraic problems from complexity theory. Discrete Math. 136 253–279.

[29] Sherstov, A. A. (2008) Communication complexity under product and nonproduct distributions. In Annual IEEE Conference on Computational Complexity, pp. 64–70.

[30] Shokrollahi, M. A., Spielman, D. A. and Stemann, V. (1997) A remark on matrix rigidity. Inform. Process. Lett. 64 283–285.

[31] Tarui, J. (1993) Randomized polynomials, threshold circuits and polynomial hierarchy. Theoret. Comput. Sci. 113 167–183.

[32] Valiant, L. G. (1977) Graph-theoretic arguments in low level complexity. In Proc. 6th MFCS, Vol. 53 of Lecture Notes in Computer Science, Springer, pp. 162–176.

[33] Vapnik, V. N. (1999) The Nature of Statistical Learning Theory, Springer, New York.

[34] Yao, A. (1983) Lower bounds by probabilistic arguments. In Proc. 15th ACM STOC, pp. 420–428.
