Book contents
- Frontmatter
- Contents
- Notes on Contributors
- 1 Introducing Arabic Corpus Linguistics
- 2 Under the Hood of arabiCorpus
- 3 Tunisian Arabic Corpus: Creating a Written Corpus of an ‘Unwritten’ Language
- 4 Accessible Corpus Annotation for Arabic
- 5 The Leeds Arabic Discourse Treebank: Guidelines for Annotating Discourse Connectives and Relations
- 6 Using the Web to Model Modern and Qurʾanic Arabic
- 7 Semantic Prosody as a Tool for Translating Prepositions in the Holy Qurʾan: A Corpus-Based Analysis
- 8 A Relational Approach to Modern Literary Arabic Conditional Clauses
- 9 Quantitative Approaches to Analysing come Constructions in Modern Standard Arabic
- 10 Approaching Text Typology through Cluster Analysis in Arabic
- Appendix: Arabic Transliteration Systems Used in This Book
- Index
3 - Tunisian Arabic Corpus: Creating a Written Corpus of an ‘Unwritten’ Language
Published online by Cambridge University Press: 11 November 2020
Summary
Introduction
After learning Standard Arabic at the Defense Language Institute, I learned Egyptian Arabic and a few other varieties (both in classroom settings and on my own), and I found that I very much enjoyed learning about the different varieties of Arabic. Then, in 2007, I decided to study Tunisian Arabic. My experience was that each variety of Arabic I learned made it easier to learn others, so I saw no reason not to branch out into Tunisian. I soon came across significant difficulties, however, when I discovered that – unlike Egyptian – Tunisian was almost entirely bereft of language-learning resources. There was no published dictionary, grammar reference, or basic coursebook similar to what I had used when learning Egyptian. I was nonetheless highly motivated to learn Tunisian (my now-husband is Tunisian and I wanted to be able to communicate with his family), so I persevered in studying it by collecting all the unpublished materials I could find. My frustration with this situation, though, motivated me to create modern, high-quality materials myself. Below I describe the first step of this process, on which all future projects will build: the creation of the first-ever corpus of Tunisian Arabic.
Arabic and Tunisian Arabic
As has already been outlined elsewhere in this volume, Arabic is a language of many varieties. The standard form of the language, generally referred to in English as Standard Arabic or Modern Standard Arabic (MSA), is closely related to the language of the Qurʾan and classical poetry. It varies very little across the different Arab countries and is highly respected by speakers of Arabic, many of whom believe it to be the most beautiful and perfect form of the language – or of any human language. Almost all written communication, including literature, is conducted in this form of the language, as are formal spoken communications such as news broadcasts and political speeches.
Standard Arabic is an important part of pan-Arab and Muslim identity, as it links Arabic speakers with a cultural past dating to before the revelation of Islam and the beginnings of the Islamic empire in the seventh century CE.
- Type: Chapter
- Information: Arabic Corpus Linguistics, pp. 30-55
- Publisher: Edinburgh University Press
- Print publication year: 2018