Corpus Search


On this page, you can search the entire corpus of this language using the Corpus Query Language (CQL) of the Corpus Workbench. A basic CQL query describes a sequence of words: each word is written as a pair of square brackets containing the restrictions on that word. Each restriction names the feature of the word to test, followed by a regular expression giving the desired value. For example, the following query finds a word ending in the letter l followed by a word starting with the letter a:

[ form = ".*l" ] [ form = "a.*" ]
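In CQL, the regular expression of a restriction has to match the whole value, so "a.*" finds words that start with a, not words that merely contain one. The Python sketch below imitates that behaviour on a small list of word forms to show which word pairs the query above would select. It is only an illustration: the function find_matches and the word forms are made up for this example, and the real search is carried out by the Corpus Workbench on the corpus itself.

```python
import re

def find_matches(tokens, patterns):
    """Return every run of consecutive tokens that matches the patterns in order."""
    compiled = [re.compile(p) for p in patterns]
    width = len(compiled)
    hits = []
    for start in range(len(tokens) - width + 1):
        window = tokens[start:start + width]
        # fullmatch mirrors CQL: the pattern must cover the entire word form.
        if all(rx.fullmatch(tok) for rx, tok in zip(compiled, window)):
            hits.append(window)
    return hits

# Hypothetical word forms, not taken from the corpus.
tokens = ["el", "ala", "mal", "a", "la", "sol"]
print(find_matches(tokens, [".*l", "a.*"]))
# prints [['el', 'ala'], ['mal', 'a']]
```

The pair "a", "la" is not returned, because "a" does not end in l and "la" does not start with a.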

If you prefer not to write regular expressions, or need more search options, the interface provides a query builder that writes the CQL for you. Click the query builder icon to open it, define your query, and click the button to insert that query into the CQL query box. You can then edit the query by hand if needed, or simply hit "search". The query builder also supports more complex CQL queries that restrict which documents or utterances are searched, for example by limiting the results to a specific genre or to speakers of a given gender. You can also search for utterances containing a word in the free translation tier: typing "house" in the "Translation search" field finds all utterances whose free translation contains the word "house".
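To give an idea of what a query builder does when you type plain words instead of regular expressions, here is a minimal Python sketch that turns literal values into a CQL query. It is not the code of the TEITOK query builder: the functions token and sequence are hypothetical, and the sketch only covers word-level restrictions on form, the one feature shown on this page. Characters that have a special meaning in regular expressions, such as the full stop, are escaped so that they are searched for literally.

```python
import re

def token(**restrictions):
    """Build one CQL token, e.g. [ form = "la" ], from literal values."""
    parts = []
    for attribute, value in restrictions.items():
        if '"' in value:
            # Quoting rules for embedded quotes are left out of this sketch.
            raise ValueError("double quotes are not handled in this sketch")
        # re.escape protects regex metacharacters so the value is matched literally.
        parts.append(f'{attribute} = "{re.escape(value)}"')
    # Several restrictions on the same word are joined with & in CQL.
    return "[ " + " & ".join(parts) + " ]"

def sequence(*tokens):
    """Join tokens into a query for words that directly follow each other."""
    return " ".join(tokens)

print(sequence(token(form="a.m."), token(form="la")))
# prints [ form = "a\.m\." ] [ form = "la" ]
```

The resulting string can be pasted into the CQL query box and adjusted by hand, just like a query produced by the query builder.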

For more information about the searchable fields for each language, and how their information was obtained from the original EAF files, see the conversion page.

When you use results obtained from DoReCo's TEITOK version in publications, such as frequency counts obtained through the TEITOK search function, please cite the following paper in addition to the Bora DoReCo dataset:

Janssen, Maarten & Frank Seifart. 2025. Searchable Language Documentation Corpora: DoReCo meets TEITOK. In: Éric Le Ferrand, Elena Klyachko, Anna Postnikova, Tatiana Shavrina, Oleg Serikov, Ekaterina Voloshina & Ekaterina Vylomova (eds.), Proceedings of the Fourth Workshop on NLP Applications to Field Linguistics, 58–64. Vienna, Austria: Association for Computational Linguistics. https://aclanthology.org/2025.fieldmatters-1.5/.