Determiner-Established Deixis to Communicative Artifacts in Pedagogical Text

This page contains links and documentation for the datasets mentioned in the paper.

Word Senses

Here is a CSV file containing the raw annotations from both annotators. Each row represents a unique synset/gloss, and the columns contain the following:

The 200 rows in the above file represent the VCS set described in the paper.

Here is a CSV file containing the final annotations, with disagreements resolved. Each row represents a unique synset/gloss, and the columns contain the following:

The 62 'y' rows in the above file represent the CCS set described in the paper.
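
As a quick check on a downloaded copy of the final annotations, the 'y' rows can be pulled out with Python's standard `csv` module. This is only a minimal sketch: the function name `select_rows`, the file name `final_annotations.csv`, and the column name `final` are hypothetical placeholders, not names taken from the actual file, so substitute the real header that holds the resolved label.

```python
import csv


def select_rows(csv_path, label_column, label="y"):
    """Return the rows of a CSV file whose label column matches `label`.

    `label_column` is a placeholder: pass whatever header the real file
    uses for the resolved annotation. Matching ignores case and
    surrounding whitespace.
    """
    wanted = label.strip().lower()
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if label_column not in (reader.fieldnames or []):
            raise KeyError(f"column {label_column!r} not found in {csv_path}")
        # Short rows yield None for missing cells, hence the `or ""`.
        return [
            row
            for row in reader
            if (row[label_column] or "").strip().lower() == wanted
        ]


if __name__ == "__main__":
    # Hypothetical file and column names; adjust both to the real download.
    ccs_rows = select_rows("final_annotations.csv", "final")
    print(len(ccs_rows))
```

If the column name is right, the printed count should agree with the 62 'y' rows stated above; a different number would suggest the wrong column or a label spelled differently in the file.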

Candidate Instances

Here is a zip file containing one CSV file for each of the 122 Wikibooks included in the paper's analysis. Each row represents a match for the sought dependency patterns (i.e., a candidate instance of communicative deixis), and the columns represent the following:

Wikibooks

Finally, here is a zip file containing the HTML files for the Wikibooks in the paper's analysis, along with the Markdown files generated from them. Because of their size, the CoreNLP results on the Wikibooks are available on request; regenerating them yourself with CoreNLP may be faster.


Read more about me or find my contact information here.