Book contents
- Frontmatter
- Contents
- Notes on Contributors
- 1 Introducing Arabic Corpus Linguistics
- 2 Under the Hood of arabiCorpus
- 3 Tunisian Arabic Corpus: Creating a Written Corpus of an ‘Unwritten’ Language
- 4 Accessible Corpus Annotation for Arabic
- 5 The Leeds Arabic Discourse Treebank: Guidelines for Annotating Discourse Connectives and Relations
- 6 Using the Web to Model Modern and Qurʾanic Arabic
- 7 Semantic Prosody as a Tool for Translating Prepositions in the Holy Qurʾan: A Corpus-Based Analysis
- 8 A Relational Approach to Modern Literary Arabic Conditional Clauses
- 9 Quantitative Approaches to Analysing come Constructions in Modern Standard Arabic
- 10 Approaching Text Typology through Cluster Analysis in Arabic
- Appendix: Arabic Transliteration Systems Used in This Book
- Index
1 - Introducing Arabic Corpus Linguistics
Published online by Cambridge University Press: 11 November 2020
Summary
Introduction
Arabic is a major world language, spoken not only in the Arabian Peninsula, but by hundreds of millions of people across northern Africa and western Asia, and more broadly around the world. Corpus linguistics – the analysis of very large amounts of natural language data using computer-assisted methods and techniques – is a major methodology in modern linguistics. Yet, so far, relatively few studies have attempted to apply this major methodology to this major language. We may say, then, that Arabic corpus linguistics as a research endeavour is still in its infancy.
This volume represents an attempt by its authors and editors to help foster its development by bringing together cutting-edge contributions on the data, methods and research foci of this nascent field. Our aim is not merely to place on record present work of this kind but also, we hope, to showcase the intersection of Arabic linguistics and corpus-based methods in such a way as to inspire future work in the area. We feel strongly that this book represents the starting-point for major developments still to come in Arabic corpus linguistics.
Our goal in this introductory chapter is to set the scene for the contributions to follow in the remainder of the book. In doing so, we have attempted to address the perspectives of three main groups of readers who we anticipate will find this book of interest. Researchers and students working in Arabic corpus linguistics are only the first of these groups. We also address here (1) corpus linguists (or those in allied fields such as computational linguistics, natural language processing, or digital humanities) who have little experience of working with Arabic; and (2) Arabic linguists with little experience of corpus methods.
With this in mind, our scene-setting necessarily involves a brief introduction to corpus linguistics on the one hand, and Arabic linguistics on the other. The next section addresses the latter of these goals, and sketches in outline those features of the Arabic language which are most important as background for an understanding of the various chapters in this book. As part of this, we will introduce the transliteration scheme used throughout the volume.
- Type: Chapter
- Information: Arabic Corpus Linguistics, pp. 1-16
- Publisher: Edinburgh University Press
- Print publication year: 2018