TEITOK visualization and search interface for Sanzhi Dargwa


Language name: Sanzhi Dargwa (Glottocode: sanz1248)
Language family: Nakh-Daghestanian (Glottocode: nakh1245)
Corpus creator: Forker, Diana and Schiborr, Nils Norman
Translations provided: Russian/English
Glosses: some
Annotation file licence: Creative Commons Attribution License

This is an interface for visualizing and searching the Sanzhi Dargwa DoReCo dataset. For more information about this dataset, including metadata, consult the DoReCo dataset page, where you can also download the data. Use the links in the left-side menu to search through this dataset, or to access individual documents for visualization.

When using data from the Sanzhi Dargwa DoReCo dataset in publications, please cite:

Forker, Diana and Schiborr, Nils Norman. 2024. Sanzhi Dargwa DoReCo dataset. In Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 2.0. Lyon: Laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/sanz1248 (Accessed on 23/01/2026). DOI:10.34847/nkl.6eaf5laq

When using results obtained from DoReCo's TEITOK version in publications, such as frequency counts obtained through the TEITOK search function, please also cite the following, in addition to the reference to the Sanzhi Dargwa DoReCo dataset:

Janssen, Maarten & Frank Seifart. 2025. Searchable Language Documentation Corpora: DoReCo meets TEITOK. In: Éric Le Ferrand, Elena Klyachko, Anna Postnikova, Tatiana Shavrina, Oleg Serikov, Ekaterina Voloshina & Ekaterina Vylomova (eds.), Proceedings of the Fourth Workshop on NLP Applications to Field Linguistics, 58–64. Vienna, Austria: Association for Computational Linguistics. https://aclanthology.org/2025.fieldmatters-1.5/.

Gloss Abbreviations

Below is the list of language-specific glosses used in the Sanzhi Dargwa corpus:

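| Gloss | LGR | Meaning |
|---|---|---|
| 1 | 1 | first person |
| 2 | 2 | second person |
| 3 | 3 | third person |
| 1/2 | none | first/second person |
| [I] | none | code switch to Icari Dargwa |
| [R] | none | code switch to Russian |
| MODAL | none | modal |
| ABL | ABL | ablative |
| ADJVZ | none | adjectivizer |
| ADVZ | none | adverbializer |
| ALLAT | ALL | allative |
| ANTE | none | spatial case 'before' |
| ATTR | none | attributive |
| CAUS | CAUS | causative |
| COMIT | COM | comitative |
| COMP | none | comparative |
| COND | COND | conditional |
| CVB | none | perfective converb |
| DAT | DAT | dative |
| DEM | DEM | demonstrative |
| EMPH | none | emphatic particle |
| ERG | ERG | ergative |
| F | F | feminine |
| GEN | GEN | genitive |
| GL_FILLER | none | pause filler |
| HAB | HAB | habitual |
| HPL | none | human plural |
| ICVB | none | imperfective converb |
| IMP | IMP | imperative |
| IN | none | spatial case 'in' |
| INDEF | INDF | indefinite |
| INF1 | none | non-inflecting infinitive |
| INF2 | none | inflecting infinitive |
| IPFV | IPFV | imperfective |
| LAT | none | lative |
| LOC | LOC | locative |
| M | M | masculine |
| MODQ | none | modal interrogative |
| MSD | none | masdar |
| N | N | neuter |
| NC | none | not considered |
| NEG | NEG | negation |
| NMLZ | NMLZ | nominalizer |
| NPL | none | neuter plural |
| NUM | NUM | numeral |
| OBL | OBL | oblique |
| ORD | none | ordinal |
| PFV | PFV | perfective |
| PL | PL | plural |
| POST | none | spatial case 'behind' |
| PRET | none | preterite |
| PROH | none | prohibitive |
| PRS | PRS | present |
| PRT | PRT | particle |
| PST | PST | past |
| PTCP | PTCP | participle |
| REFL | REFL | reflexive |
| SG | SG | singular |
| SPR | none | spatial case 'on' |
| SUB | none | spatial case 'under' |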
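
For readers who export gloss tiers and want to compare them with other corpora, the mapping in the list above can be applied programmatically. The following is a minimal sketch, not part of TEITOK or DoReCo: it assumes that the LGR column gives the Leipzig Glossing Rules equivalent of each corpus label, that glosses are separated by `-`, `=`, `.` or whitespace, and it includes only a handful of rows. The function name `to_lgr` and the example gloss strings are hypothetical and are not taken from the corpus.

```python
import re

# Subset of the gloss list above: corpus label -> LGR label.
# None marks rows where the LGR column says "none".
CORPUS_TO_LGR = {
    "ALLAT": "ALL",
    "COMIT": "COM",
    "INDEF": "INDF",
    "ERG": "ERG",
    "PFV": "PFV",
    "PL": "PL",
    "PRET": None,
    "HPL": None,
}


def to_lgr(gloss_line: str) -> str:
    """Replace corpus labels with their LGR equivalents.

    Labels without an LGR equivalent, and anything not in the
    mapping (e.g. lexical glosses), are left unchanged.
    """
    # Split on assumed gloss separators, keeping the separators themselves.
    parts = re.split(r"([-=.\s])", gloss_line)
    return "".join(CORPUS_TO_LGR.get(part) or part for part in parts)


# Hypothetical gloss strings, for illustration only.
print(to_lgr("friend-PL-COMIT"))
print(to_lgr("go.PFV-PRET"))
```

Labels with no LGR equivalent are kept as they are rather than dropped, so the converted line still has one gloss per morpheme.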