Natural Language Processing Analysis of Online Students’ Discussion Board Participation

Jackie Wickham

Courses: Fall 2015 MS in Global Health 405, 408, 417, 427, & 452
Students: 15-25 students per course



I wanted to look for patterns in the large volume of data that students in online classes generate when they participate in discussion forums.


I used the Natural Language Processing engine and the Discussion Analytics LTI to pull information from the Fall 2015 course offerings in the online Master of Science in Global Health program, a total of five courses.

Objectives & Outcomes

Early faculty feedback indicates considerable interest in projects like this. The project has the potential to enhance student learning by capturing a picture of student interests and participation levels and using that information to further develop course resources. For instance, the NLP engine captures the organizations that students mention in discussions. Faculty can use this information to find resources published by those organizations to enrich the discussion, or to invite guest speakers from those organizations to class.


I was able to pull data from five courses, which covers all of the Fall 2015 offerings in the Master of Science in Global Health program. From this data, I saw significant differences between courses, both in the amount of student participation and in the topics being discussed.
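
A per-course comparison of this kind could be sketched roughly as follows. This is only an illustration of the tallying step, not the NLP engine or the Discussion Analytics LTI used in the project. The `posts` records, their layout, the `summarize` helper, and every sample value are hypothetical placeholders, not project data.

```python
from collections import Counter, defaultdict

# Hypothetical records, invented for illustration only:
# (course number, student id, organizations mentioned in the post)
posts = [
    ("405", "student_a", ["Org A"]),
    ("405", "student_b", ["Org A", "Org B"]),
    ("405", "student_a", []),
    ("408", "student_c", ["Org B"]),
]

def summarize(posts):
    """Return (posts per course, organization mention counts per course)."""
    post_counts = Counter()
    org_counts = defaultdict(Counter)
    for course, _student, orgs in posts:
        post_counts[course] += 1          # participation level
        org_counts[course].update(orgs)   # organizations discussed
    return post_counts, org_counts

post_counts, org_counts = summarize(posts)
for course in sorted(post_counts):
    # Show each course's post count and its most-mentioned organization.
    print(course, post_counts[course], org_counts[course].most_common(1))
```

Keeping the counts grouped by course makes it simple to put courses side by side, or to repeat the same tally for a later term.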

Results are available at

Lessons Learned

I was surprised at how much data I was able to gather from such a small sample. If I did the project again, I might compare participation levels in the same course from term to term, or look at other course variables that might affect student participation in discussions.