TEITOK visualization and search interface for Beja


Language name: Beja (beja1238)
Language family: Afro-Asiatic (afro1255)
Corpus creator: Vanhove, Martine
Translations provided: English
Glosses: all
Annotation file licence: CC BY-NC

This is an interface for visualizing and searching the Beja DoReCo dataset. For more information about this dataset, including metadata, consult the DoReCo dataset page, where you can also download the data. Use the links in the left-side menu to search through this dataset, or to access individual documents for visualization.

When using data from the Beja DoReCo dataset in publications, please cite:

Vanhove, Martine. 2024. Beja DoReCo dataset. In Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 2.0. Lyon: Laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/beja1238 (Accessed on 23/01/2026). DOI:10.34847/nkl.6eaf5laq

When using results obtained from DoReCo's TEITOK version in publications, such as frequency counts obtained through the TEITOK search function, please cite the following in addition to the reference to the Beja DoReCo dataset:

Janssen, Maarten & Frank Seifart. 2025. Searchable Language Documentation Corpora: DoReCo meets TEITOK. In: Éric Le Ferrand, Elena Klyachko, Anna Postnikova, Tatiana Shavrina, Oleg Serikov, Ekaterina Voloshina & Ekaterina Vylomova (eds.), Proceedings of the Fourth Workshop on NLP Applications to Field Linguistics, 58–64. Vienna, Austria: Association for Computational Linguistics. https://aclanthology.org/2025.fieldmatters-1.5/.

Gloss Abbreviations

Below is the list of language-specific glosses used in the Beja corpus:

Gloss | LGR | Meaning
1 | 1 | first person
2 | 2 | second person
3 | 3 | third person
ABL | ABL | ablative
ABST | none | abstract suffix
ACC | ACC | accusative
ACMP | none | perfective
ADJ | ADJ | adjective
ADJVZ | none | adjectivizer
ADJZ | none | adjectivizer
ADJZR | none | adjectivizer
ADRE | none | addressee
ADRF | none | form of address
ADVS | none | adversative
ANT | none | anterior
AOR | none | aorist
ATTENU | none | (unclear)
AUG | none | augmentative
CAUS | CAUS | causative
circ | none | circumfix
CMPR | none | comparative
COLL | none | (unclear)
COM | COM | comitative
CONTR | none | contrastive
COORD | none | coordinative
COP | COP | copula
CSL | none | causal
CVB | none | converb
DAT | DAT | dative
DBL | none | double
DEF | DEF | definite
DIM | none | diminutive
DIR | none | directional
DIST | DIST | distal
DISTR | DISTR | distributive
DM | none | discourse marker
EMPH | none | emphatic
EXCM | none | exclamation
F | F | feminine
FS | none | false start
FUTL | FUT | future
GEN | GEN | genitive
GNRL | none | general
HESIT | none | hesitation
IMP | IMP | imperative
INDF | INDF | indefinite
INT | none | intensive
INTJ | none | interjection
IPFV | IPFV | imperfective
L | none | linker
LINK | none | linker
LOC | LOC | locative
M | M | masculine
MID | none | middle
MNR | none | manner
MOY | none | middle
N.AC | none | action noun
N.AGN | none | agent noun
NEG | NEG | negation
NML | none | (unclear)
NMLZ | NMLZ | nominalizer
NOM | NOM | nominative
NREG | none | (unclear)
OBJ | OBJ | object
OPT | none | optative
ORD | none | ordinal
PASS | PASS | passive
PFV | PFV | perfective
PL | PL | plural
PLAC | none | pluractional
PLC | none | (unclear)
POSS | POSS | possessive
POT | none | potential
PROH | PROH | prohibitive
PROX | PROX | proximal
RCPT | none | recipient
RECP | RECP | reciprocal
REFL | REFL | reflexive
REL | none | relative clitic marker
SEQ | none | sequential
SG | SG | singular
SIMIL | none | similative
SING | none | singulative
SMLT | none | simultaneity
VBLZ | VBLZ | verbalizer
VN | none | verbonominal
VOC | VOC | vocative
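
Several meanings in this list appear under more than one label (for example, ADJVZ, ADJZ and ADJZR are all listed as adjectivizer, and MID and MOY are both listed as middle). The following is a minimal Python sketch of how such variant labels could be grouped before comparing frequency counts. It uses only a small excerpt of the list, and the names `GLOSS_ROWS`, `labels_by_meaning` and `variant_labels` are hypothetical; they are not part of TEITOK or DoReCo. Whether labels that share a meaning are interchangeable in the corpus is an assumption that should be checked against the data.

```python
from collections import defaultdict

# Excerpt of the gloss list: (corpus gloss, LGR equivalent, meaning).
# None stands for "none", i.e. no LGR equivalent is listed.
GLOSS_ROWS = [
    ("ADJVZ", None, "adjectivizer"),
    ("ADJZ", None, "adjectivizer"),
    ("ADJZR", None, "adjectivizer"),
    ("L", None, "linker"),
    ("LINK", None, "linker"),
    ("MID", None, "middle"),
    ("MOY", None, "middle"),
    ("PFV", "PFV", "perfective"),
    ("ACMP", None, "perfective"),
    ("PL", "PL", "plural"),
]


def labels_by_meaning(rows):
    """Map each meaning to the list of gloss labels listed with it."""
    groups = defaultdict(list)
    for gloss, _lgr, meaning in rows:
        groups[meaning].append(gloss)
    return dict(groups)


def variant_labels(rows):
    """Keep only the meanings that are listed under more than one label."""
    return {
        meaning: glosses
        for meaning, glosses in labels_by_meaning(rows).items()
        if len(glosses) > 1
    }


if __name__ == "__main__":
    for meaning, glosses in variant_labels(GLOSS_ROWS).items():
        print(f"{meaning}: {', '.join(glosses)}")
```

When searching for a category such as adjectivizer, a grouping like this serves as a reminder to query every listed variant rather than a single label.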