The Czech National Corpus: research infrastructure for empirical language-oriented inquiry

The Czech National Corpus (CNC) project, established in 1994, aims to map the Czech language continuously along every available dimension: time, region and genre. The CNC builds and makes available large electronic text collections (language corpora) that serve as a basis for research on contemporary Czech (both written and spoken), historical Czech and other languages. It also develops methodology for empirical linguistic research and tools for exploring language corpora. Since 2012 the CNC has been recognized as a research infrastructure for empirical language‑oriented inquiry in many fields of the social sciences and humanities, especially linguistics, psychology, sociology and history. Thanks to its large, high‑quality language resources, the CNC is a sought‑after partner in many international research projects. In addition, the CNC offers consulting, analyses for research or popularization purposes, automatic text processing, and data packages for research on Czech as well as on other languages for contrastive research.

Selected outputs

  • Cvrček, V. et al. (2010): Mluvnice současné češtiny. Nakladatelství Karolinum, Praha. ISBN 978-80-246-1743-5.

  • Čermák, F. – Křen, M. (eds) (2011): A Frequency Dictionary of Czech: Core Vocabulary for Learners. Routledge, London. ISBN 978-0-415-57661-1.

  • Čermák, F. – Rosen, A. (2012): The case of InterCorp, a multilingual parallel corpus. In: International Journal of Corpus Linguistics, 17(3), pp. 411–427.

  • Křen, M. et al. (2016): SYN2015: Representative Corpus of Contemporary Written Czech. In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16), pp. 2522–2528. Portorož: ELRA. ISBN 978-2-9517408-9-1.

Last change: August 1, 2016 16:12 