My research brings together natural language processing (NLP), privacy, and computational social science. I direct the Human Language Technologies Lab at Penn State.

I am interested in solving problems to enable computers to do meaningful work with large volumes of natural language text. My lab develops new methods for NLP and applies them across domains. I am particularly interested in breaking down technology's "walls of text", i.e., situations where a human user or decision-maker is expected to consume a large quantity of text to take action while lacking sufficient resources (time, expertise) to properly understand what they have been given. I have applied this paradigm to privacy policies, scholarly manuscripts, social media data, speech transcripts, and emails. I am always interested in collaborating on new problems involving text data.

My work with PrivaSeer and the Usable Privacy Policy Project uses a combination of crowdsourcing, NLP, and machine learning to extract key details from websites' and mobile apps' privacy policies. I am also interested in privacy in online social networks, and within that topic area I have published on personal disclosures, location sharing, and post deletion.

You can also browse my publications.


You can watch the video below (in which I am the first speaker) about the NLP components of the Usable Privacy Policy Project. It's part of a series of videos describing our work.

I speak at the beginning and end (and in the middle) of the video below promoting artificial intelligence research at Penn State. (Production credits and press release are here.)


Our work has received support from the National Science Foundation, the National Institutes of Health, the Center for Cybersecurity Research and Education, the Center for Social Data Analytics, the Usable Privacy Policy Project, the Pennsylvania Space Grant Consortium, and the Ohio Supercomputer Center.
