Welcome to edu-convokit’s documentation!


The edu-convokit is an open-source framework for studying language data in educational settings. It provides a practical, efficient pipeline for text pre-processing, annotation, and analysis, tailored to the needs of researchers and developers. By simplifying these operations, the toolkit aims to make educational language data analysis more accessible and reproducible, and to advance both natural language processing (NLP) and education research.


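You can install the latest released version of the edu-convokit with pip. The leading "!" runs the command from a notebook cell (for example, in Colab); omit it when installing from a terminal: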

!pip install edu-convokit
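
After installing, it can be useful to confirm that the package is visible to the Python environment you are actually running. The following is a minimal sketch that uses only the standard library; the helper name installed_version is hypothetical and is not part of edu-convokit, and the only name taken from this page is the distribution name used in the pip command.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; not part of edu-convokit.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "edu-convokit" is the distribution name from the pip command above.
    version = installed_version("edu-convokit")
    if version is None:
        print("edu-convokit is not installed in this environment")
    else:
        print(f"edu-convokit {version} is installed")
```

If the check reports that the package is missing right after a successful install, the notebook kernel or interpreter is probably not the one pip installed into.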

Colab Notebooks

Basics of edu-convokit

Datasets with edu-convokit



If you use the edu-convokit in your research, please cite the following paper:

TODO: coming soon...


We welcome contributions to the edu-convokit! Feel free to make a pull request or submit an issue on GitHub: https://github.com/rosewang2008/edu-convokit.


If you have any questions, please contact Rose E. Wang at rewang@cs.stanford.edu.