TEITOK visualization and search interface for Northern Alta


Language name: Northern Alta (nort2875)
Language family: Austronesian (aust1307)
Corpus creator: Garcia-Laguia, Alexandro
Translations provided: Tagalog/English
Glosses: all
Annotation file licence: CC BY-NC

This is an interface for visualizing and searching the Northern Alta DoReCo dataset. For more information about this dataset, including metadata, consult the DoReCo dataset page, where you can also download the data. Use the links in the left-side menu to search through this dataset, or to access individual documents for visualization.

When using data from the Northern Alta DoReCo dataset in publications, please cite:

Garcia-Laguia, Alexandro. 2024. Northern Alta DoReCo dataset. In Seifart, Frank, Ludger Paschen and Matthew Stave (eds.). Language Documentation Reference Corpus (DoReCo) 2.0. Lyon: Laboratoire Dynamique Du Langage (UMR5596, CNRS & Université Lyon 2). https://doreco.huma-num.fr/languages/nort2875 (Accessed on 23/01/2026). DOI:10.34847/nkl.6eaf5laq

When using results obtained from DoReCo's TEITOK version in publications, such as frequency counts obtained through the TEITOK search function, please also cite the following, in addition to the reference to the Northern Alta DoReCo dataset:

Janssen, Maarten & Frank Seifart. 2025. Searchable Language Documentation Corpora: DoReCo meets TEITOK. In: Éric Le Ferrand, Elena Klyachko, Anna Postnikova, Tatiana Shavrina, Oleg Serikov, Ekaterina Voloshina & Ekaterina Vylomova (eds.), Proceedings of the Fourth Workshop on NLP Applications to Field Linguistics, 58–64. Vienna, Austria: Association for Computational Linguistics. https://aclanthology.org/2025.fieldmatters-1.5/.

Gloss Abbreviations

Below is the list of language-specific glosses used in the Northern Alta corpus:

Gloss  LGR   Meaning
1      1     first person
2      2     second person
3      3     third person
ABS    ABS   absolutive
AV     none  actor voice
CAU    CAU   causative
CMTV   none  (unclear)
CV     none  conveyance voice
CWA    none  affixed content word
CWA1   none  affixed content word
CWA2   none  affixed content word
d      none  deictic
DEM    DEM   demonstrative
DIST   DIST  distal
FDIST  DIST  far distal
GEN    GEN   genitive
GER    none  gerund
INST   INS   instrumental
LK     none  linker
LOC    LOC   locative
LPROX  PROX  less proximal
LV     none  locative voice
MED    none  medial
OBL    OBL   oblique
p      PL    plural
pe     none  plural exclusive
pi     none  plural inclusive
PM     PRED  predicate marker
POT    none  potentive
PRF    PRF   perfective
PRG    PROG  progressive
PROX   PROX  proximal
PV     none  patient voice
RDP    none  reduplication
s      SG    singular
SML    none  similative
SPEC   none  specificity marker
ST     none  stative
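
For users who export search results and want to compare them with corpora glossed using the LGR labels, the list above can be used as a simple lookup. The following is only a minimal sketch in Python, not part of TEITOK or DoReCo: the names GLOSS_TO_LGR and to_lgr are hypothetical, the dictionary covers only a subset of the list, and it assumes that the labels within one gloss are separated by a period, which may not match the actual annotation files.

```python
# Hypothetical helper, not part of TEITOK or DoReCo.
# Subset of the gloss list above: corpus gloss -> LGR label.
# None marks glosses that have "none" in the LGR column.
GLOSS_TO_LGR = {
    "1": "1", "2": "2", "3": "3",
    "ABS": "ABS", "AV": None, "FDIST": "DIST", "GEN": "GEN",
    "INST": "INS", "LPROX": "PROX", "p": "PL", "PM": "PRED",
    "PRG": "PROG", "s": "SG",
}


def to_lgr(gloss, sep="."):
    """Replace each label that has an LGR equivalent; keep all others unchanged.

    Assumes labels are joined by `sep` (an assumption about the data format).
    """
    parts = gloss.split(sep)
    # Labels without an LGR equivalent (None) or missing from the subset
    # are passed through as they are.
    return sep.join(GLOSS_TO_LGR.get(part) or part for part in parts)


if __name__ == "__main__":
    print(to_lgr("3.p.GEN"))
    print(to_lgr("PRG.AV"))
```

Labels with no LGR equivalent, such as AV, are deliberately left unchanged rather than dropped, so that no information from the original annotation is lost.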