Conference papers

Annotating Verbal Multiword Expressions in Arabic: Assessing the Validity of a Multilingual Annotation Procedure

Najet Hadj Mohamed 1, 2, Cherifa Ben Khelil 1, Agata Savary 3, Iskandar Keskes 2, Jean-Yves Antoine 1, Lamia Belguith Hadrich 2
1 BDTLN - Bases de données et traitement des langues naturelles
LIFAT - Laboratoire d'Informatique Fondamentale et Appliquée de Tours
3 ILES - Information, Langue Ecrite et Signée
LISN - Laboratoire Interdisciplinaire des Sciences du Numérique, STL - Sciences et Technologies des Langues
Abstract : This paper describes our efforts to extend the PARSEME framework to Modern Standard Arabic. We tested the applicability of the PARSEME guidelines by measuring inter-annotator agreement in the early annotation stage. A subset of 1,062 sentences from the Prague Arabic Dependency Treebank (PADT) was selected and annotated independently by two native speakers of Arabic. From their annotations, we built a new Arabic corpus with over 1,250 annotated verbal multiword expressions (VMWEs). This corpus already exceeds the smallest corpora of the PARSEME suite and enables first observations. We discuss our annotation guideline schema, which shows that full MWE annotation is feasible in Arabic, with good inter-annotator agreement.

https://hal.archives-ouvertes.fr/hal-03712937
Contributor : Agata Savary
Submitted on : Monday, July 4, 2022 - 12:29:16 PM
Last modification on : Friday, August 5, 2022 - 9:27:31 AM

File

Hadj-Mohamed-et-al-2022-LREC-p...
Publisher files allowed on an open archive

Identifiers

  • HAL Id : hal-03712937, version 1

Citation

Najet Hadj Mohamed, Cherifa Ben Khelil, Agata Savary, Iskandar Keskes, Jean-Yves Antoine, et al. Annotating Verbal Multiword Expressions in Arabic: Assessing the Validity of a Multilingual Annotation Procedure. 13th Conference on Language Resources and Evaluation (LREC 2022), Jun 2022, Marseille, France. pp. 1839-1848. ⟨hal-03712937⟩
