2 - Distributional Representations
from Part I - Theory
Published online by Cambridge University Press: 07 September 2023
Summary
The distributional representation of a lexical item is typically a vector representing its co-occurrences with linguistic contexts. This chapter introduces the basic notions needed to construct distributional semantic representations from corpora. We present (i) the major types of linguistic contexts used to characterize the distributional properties of lexical items (e.g., window-based and syntactic collocates, and documents), (ii) their representation with co-occurrence matrices, whose rows are labeled with lexemes and columns with contexts, (iii) mathematical methods to weight the importance of contexts (e.g., Pointwise Mutual Information and entropy), (iv) the distinction between high-dimensional explicit vectors and low-dimensional embeddings with latent dimensions, (v) dimensionality reduction methods to generate embeddings from the original co-occurrence matrix (e.g., Singular Value Decomposition), and (vi) vector similarity measures (e.g., cosine similarity).
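As a rough illustration of how steps (i) to (vi) fit together, the following minimal sketch builds a window-based co-occurrence matrix from a toy corpus, weights it, reduces it with Singular Value Decomposition, and compares lexemes with cosine similarity. It is not taken from the chapter: the toy sentences, the function names, the window size of 2, and the choice of k = 2 latent dimensions are all assumptions made for illustration. The weighting shown is positive PMI (negative and undefined values clipped to zero), a common variant of the Pointwise Mutual Information mentioned above, used here as an assumption.

```python
import numpy as np

def cooccurrence_matrix(sentences, window=2):
    """Count window-based collocates: rows are lexemes, columns are context words."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    for s in sentences:
        for i, target in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a token is not its own context
                    counts[index[target], index[s[j]]] += 1
    return vocab, counts

def ppmi(counts):
    """Positive PMI weighting (assumed variant): max(0, log2 p(w,c) / (p(w) p(c)))."""
    p_wc = counts / counts.sum()
    p_w = p_wc.sum(axis=1, keepdims=True)
    p_c = p_wc.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(p_wc / (p_w * p_c))
    pmi[~np.isfinite(pmi)] = 0.0  # unseen pairs yield -inf or nan; treat as 0
    return np.maximum(pmi, 0.0)

def svd_embeddings(matrix, k):
    """Keep the top-k latent dimensions of a truncated SVD as dense embeddings."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0

# Hypothetical toy corpus, already tokenized.
sentences = [
    ["the", "cat", "chased", "the", "mouse"],
    ["the", "dog", "chased", "the", "cat"],
    ["the", "dog", "ate", "the", "food"],
]

vocab, counts = cooccurrence_matrix(sentences, window=2)
weighted = ppmi(counts)            # high-dimensional explicit vectors
emb = svd_embeddings(weighted, 2)  # low-dimensional embeddings

cat, dog = vocab.index("cat"), vocab.index("dog")
print("explicit cosine(cat, dog):", cosine(weighted[cat], weighted[dog]))
print("embedding cosine(cat, dog):", cosine(emb[cat], emb[dog]))
```

With a corpus this small the similarity scores carry no linguistic weight; the sketch only shows how the explicit vectors (rows of the weighted matrix) and the embeddings (rows of the reduced matrix) are obtained and compared.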
- Type: Chapter
- Information: Distributional Semantics, pp. 26-88
- Publisher: Cambridge University Press
- Print publication year: 2023