TEITOK visualization and search interface for Sanzhi Dargwa
This is an interface for visualizing and searching the Sanzhi Dargwa DoReCo dataset. For more information about this dataset, including metadata, consult the DoReCo dataset page, where you can also download the data. Use the links in the left-side menu to search through this dataset, or to access individual documents for visualization.
When using actual data from the Sanzhi Dargwa DoReCo dataset in publications, please cite:
Forker, Diana and Schiborr, Nils Norman. 2024. Sanzhi Dargwa DoReCo dataset. In Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 2.0. Lyon: Laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/sanz1248 (Accessed on 23/01/2026). DOI:10.34847/nkl.6eaf5laq
When using results obtained from DoReCo's TEITOK version in publications, such as frequency counts obtained through the TEITOK search function, please cite the following in addition to the reference to the Sanzhi Dargwa DoReCo dataset:
Janssen, Maarten & Frank Seifart. 2025. Searchable Language Documentation Corpora: DoReCo meets TEITOK. In: Éric Le Ferrand, Elena Klyachko, Anna Postnikova, Tatiana Shavrina, Oleg Serikov, Ekaterina Voloshina & Ekaterina Vylomova (eds.), Proceedings of the Fourth Workshop on NLP Applications to Field Linguistics, 58–64. Vienna, Austria: Association for Computational Linguistics. https://aclanthology.org/2025.fieldmatters-1.5/.
Gloss Abbreviations
Below is the list of language-specific glosses used in the Sanzhi Dargwa corpus:

| Gloss | LGR | Meaning |
|---|---|---|
| 1 | 1 | first person |
| 2 | 2 | second person |
| 3 | 3 | third person |
| 1/2 | none | first/second person |
| [I] | none | code switch to Icari Dargwa |
| [R] | none | code switch to Russian |
| MODAL | none | modal |
| ABL | ABL | ablative |
| ADJVZ | none | adjectivizer |
| ADVZ | none | adverbializer |
| ALLAT | ALL | allative |
| ANTE | none | spatial case 'before' |
| ATTR | none | attributive |
| CAUS | CAUS | causative |
| COMIT | COM | comitative |
| COMP | none | comparative |
| COND | COND | conditional |
| CVB | none | perfective converb |
| DAT | DAT | dative |
| DEM | DEM | demonstrative |
| EMPH | none | emphatic particle |
| ERG | ERG | ergative |
| F | F | feminine |
| GEN | GEN | genitive |
| GL_FILLER | none | pause filler |
| HAB | HAB | habitual |
| HPL | none | human plural |
| ICVB | none | imperfective converb |
| IMP | IMP | imperative |
| IN | none | spatial case 'in' |
| INDEF | INDF | indefinite |
| INF1 | none | non-inflecting infinitive |
| INF2 | none | inflecting infinitive |
| IPFV | IPFV | imperfective |
| LAT | none | lative |
| LOC | LOC | locative |
| M | M | masculine |
| MODQ | none | modal interrogative |
| MSD | none | masdar |
| N | N | neuter |
| NC | none | not considered |
| NEG | NEG | negation |
| NMLZ | NMLZ | nominalizer |
| NPL | none | neuter plural |
| NUM | NUM | numeral |
| OBL | OBL | oblique |
| ORD | none | ordinal |
| PFV | PFV | perfective |
| PL | PL | plural |
| POST | none | spatial case 'behind' |
| PRET | none | preterite |
| PROH | none | prohibitive |
| PRS | PRS | present |
| PRT | PRT | particle |
| PST | PST | past |
| PTCP | PTCP | participle |
| REFL | REFL | reflexive |
| SG | SG | singular |
| SPR | none | spatial case 'on' |
| SUB | none | spatial case 'under' |
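For readers who export glosses and want them in Leipzig Glossing Rules (LGR) form, the table can be applied mechanically. The following is a minimal sketch, not part of TEITOK or DoReCo: the function name `to_lgr`, the assumed separator set, and the sample gloss string are all hypothetical. The mapping contains only the three labels whose LGR column in the table differs from the corpus label; labels marked "none" have no LGR counterpart and are left unchanged.

```python
import re

# Corpus labels whose LGR equivalent differs, copied from the table above.
CORPUS_TO_LGR = {"ALLAT": "ALL", "COMIT": "COM", "INDEF": "INDF"}


def to_lgr(gloss: str) -> str:
    """Replace corpus-specific labels with LGR labels, keeping separators.

    Assumes glosses are segmented with '-', '=', '.' or ':' (an assumption
    about the export format, not something stated on this page).
    """
    # The capture group makes re.split keep the separators in the result.
    parts = re.split(r"([-=.:])", gloss)
    return "".join(CORPUS_TO_LGR.get(part, part) for part in parts)


# Made-up gloss string for illustration only, not taken from the corpus.
print(to_lgr("house-OBL-ALLAT"))  # house-OBL-ALL
```

Only whole labels are replaced, so a label such as COMP is not affected by the COMIT entry.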