TEITOK visualization and search interface for Dolgan


Language name: Dolgan (dolg1241)
Language family: Turkic (turk1311)
Corpus creator: Däbritz, Chris Lasse and Kudryakova, Nina and Stapert, Eugénie and Arkhipov, Alexandre
Translations provided: English/German/Russian
Glosses: all
Annotation file licence: Creative Commons Attribution License

This is an interface for visualizing and searching the Dolgan DoReCo dataset. For more information about this dataset, including metadata, consult the DoReCo dataset page, where you can also download the data. Use the links in the left-side menu to search through this dataset, or to access individual documents for visualization.

When using data from the Dolgan DoReCo dataset in publications, please cite:

Däbritz, Chris Lasse and Kudryakova, Nina and Stapert, Eugénie and Arkhipov, Alexandre. 2024. Dolgan DoReCo dataset. In Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 2.0. Lyon: Laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/dolg1241 (Accessed on 23/01/2026). DOI:10.34847/nkl.6eaf5laq

When using results obtained from DoReCo's TEITOK version in publications, such as frequency counts obtained through the TEITOK search function, please cite the following in addition to the reference to the Dolgan DoReCo dataset:

Janssen, Maarten & Frank Seifart. 2025. Searchable Language Documentation Corpora: DoReCo meets TEITOK. In: Éric Le Ferrand, Elena Klyachko, Anna Postnikova, Tatiana Shavrina, Oleg Serikov, Ekaterina Voloshina & Ekaterina Vylomova (eds.), Proceedings of the Fourth Workshop on NLP Applications to Field Linguistics, 58–64. Vienna, Austria: Association for Computational Linguistics. https://aclanthology.org/2025.fieldmatters-1.5/.

Gloss Abbreviations

Below is the list of language-specific glosses used in the Dolgan corpus:

Gloss   LGR   Meaning
1       1     first person
2       2     second person
3       3     third person
ABL     ABL   ablative
ACC     ACC   accusative
ADJZ    none  adjectivizer
ADVZ    none  adverbializer
AFFIRM  none  affirmation
AG      none  agent
ANT     none  anteriority
APRX    none  approximative
CAP     none  (unclear)
CAUS    CAUS  causative
COLL    none  collective
COM     COM   comitative
COMP    none  comparative
COND    COND  conditional
CVB     CVB   converb
DAT     DAT   dative
DIM     none  diminutive
DISTR   none  distributive
DRV     none  (unclear)
DU      DU    dual
EMOT    none  (unclear)
EMPH    none  emphasis
EP      none  epenthetic
EVID    none  evidential
EX      EXCL  exclusive
FREQ    none  frequentative
FUT     FUT   future
GEN     GEN   genitive
HAB     none  habitual
IMP     IMP   imperative
INCH    none  inchoative
INDEF   INDF  indefinite
INFER   none  inferential
INSTR   INS   instrumental
INTJ    none  interjection
INTNS   none  intensive particle
ITER    none  iterative
LIM     none  limitative
LOC     LOC   locative
MED     none  (unclear)
MLTP    none  multiplicative
MOD     none  modal
MULT    none  multiplicative
NEC     none  necessary
NEG     NEG   negation
NMNZ    NMLZ  nominalizer
NOM     NOM   nominative
ORD     none  ordinal
PART    none  partitive
PASS    PASS  passive
PL      PL    plural
POSS    POSS  possessive
POT     none  potential
PROPR   none  proper name
PRS     PRS   present
PST1    none  past 1
PST2    none  past 2
PTCP    PTCP  participle
PURP    none  purposive
Q       none  question particle
RECP    RECP  reciprocal
REFL    REFL  reflexive
SEQ     none  sequential
SG      SG    singular
SIM     none  simultaneity
TEMP    none  temporal
VBZ     none  verbalizer
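
When comparing search results across corpora, it can help to rewrite the corpus-specific labels into their LGR counterparts. The following is a minimal Python sketch of that idea, not part of TEITOK or DoReCo. It assumes that the LGR column refers to the Leipzig Glossing Rules and that gloss strings separate labels with `-`, `=`, `.` or `:`. The function name `to_lgr` and the example gloss strings are hypothetical. Only the four labels whose LGR form differs in the table above are mapped; labels marked "none" are left unchanged.

```python
import re

# Corpus labels whose LGR equivalent differs, copied from the gloss table.
CORPUS_TO_LGR = {
    "EX": "EXCL",
    "INDEF": "INDF",
    "INSTR": "INS",
    "NMNZ": "NMLZ",
}

# Assumed label separators inside a gloss string (not confirmed by the source).
SEPARATORS = re.compile(r"([-=.:])")


def to_lgr(gloss: str) -> str:
    """Replace corpus-specific labels in a gloss string with LGR labels."""
    # The capturing group keeps the separators in the split result,
    # so the string can be rebuilt exactly apart from the mapped labels.
    parts = SEPARATORS.split(gloss)
    return "".join(CORPUS_TO_LGR.get(part, part) for part in parts)


if __name__ == "__main__":
    # Hypothetical gloss strings, for illustration only.
    for example in ["knife-INSTR", "go-PTCP-NMNZ", "come-PST1"]:
        print(example, "->", to_lgr(example))
```

Because the lookup matches whole labels only, a label such as `INDEF` is replaced while a longer unmapped label that merely contains `EX` would be left as it is.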