News Archive

2023-10-25: An article about one of our recent DocEng publications was the top news item in today's Penn State Today.

2023-10-23: Radio host Scott Geesey on 98.7 The Fox (State College, PA) interviewed me live on air about artificial intelligence.

2023-10-20: I participated in an ICDS Symposium panel on "Navigating the Promises and Pitfalls of Generative AI". Thanks to Daryl Lim and Wes Reinhart for great conversation and Vasant Honavar for moderating.

2023-10-06: Penn State Strategic Communications recorded an informal practice interview I did about artificial intelligence. The audio level of the interviewer is low because we weren't set up for an actual production interview.

2023-09-26: Thanks to Auburn University's AI@AU Initiative for hosting my invited talk "Natural Language Processing for Privacy, Empowerment, and Social Good".

2023-08-28: Congrats to my advisees Mukund Srinath and Pranav Venkit and lab alumna Soundarya Sundareswara for winning the DocEng Best Student Paper Award with "Privacy Lost and Found: An Investigation at Scale of Web Privacy Policy Availability"!

2023-08-09: I speak in three recent WPMT TV news segments about the impact of AI on education, work, and public discourse.

2023-07-29: In Inglorious Proposals, I added a section with an overview of all the grant proposals I submitted during my first five pre-tenure years.

2023-07-18: Inside Higher Ed quotes me about scam emails that target university students, and it refers to our lab's research on the peculiarities of scam emails in higher education.

2023-07-14: Wealth Management quotes me in an article about how generative AI poses a growing cybersecurity threat to investment advisors.

2023-07-02: Our TrustNLP submission "Automated Ableism: An Exploration of Explicit Disability Biases in AIaaS Sentiment and Toxicity Analysis Models" received a Best Short Paper Award! Thanks to my PhD students Pranav Venkit and Mukund Srinath for leading this work.

2023-05-17: WTAJ interviewed me (article and video segment) about some impacts of large language models on education.

2023-05-03: I led a birds-of-a-feather session about NLP on legal text at EACL 2023. Thanks to everyone who attended.

2023-04-30: I speak about cybersecurity risks in this Penn State News explainer about ChatGPT. Penn State News also recently published stories about my lab's work on nationality biases in GPT-2 and topics in scam emails sent to university addresses.

2023-04-22: Centre County Report interviewed me about artificial intelligence being used by swatting perpetrators. I speak briefly in the first news segment in this video.

2023-04-21: Here's a Penn State News article about my NSF CAREER award.

2023-03-24: I'm pleased to share that my NSF CAREER proposal has been awarded! I'll be using NLP to make consumer-oriented legal documents ("COLDs") more actionable and engaging.

2023-03-17: I did a live interview on NBC News NOW, during a news segment about the potential impact of ChatGPT on online dating. Here's a link to the beginning of the segment, and another to when I start talking.

2023-03-09: I was an invited participant at the NSF SaTC Vision 2.0 Workshop in Dallas, Texas.

2023-03-08: Our paper "Creation and Analysis of a Corpus of Scam Emails Targeting Universities" was accepted to WebConf 2023.

2023-03-06: The report from the Dagstuhl seminar I recently participated in, "Privacy in Speech and Language Technology", is available online.

2023-02-20: I added Inglorious Proposals to my advice pages. It contains chronology case studies and observations on applying for research funding from NSF.

2023-02-10: Thanks to Penn State's Center for Language Science for hosting my talk "Sociodemographic Biases in Natural Language Processing: Two Case Studies" as part of the Language Meets Technology speaker series.

2023-01-31: Thanks to Norman Sadeh and Hana Habib for virtually hosting my talk "Natural Language Processing for Privacy and Social Good" as part of Carnegie Mellon University's Privacy Seminar.

2023-01-27: Our proposal "SaTC: CORE: Small: Toward Privacy Equity through Contextual Understanding of Self-Disclosure" has been awarded. I'll be working with colleague Sarah Rajtmajer on studying relationships between socioeconomic status and individuals' privacy behaviors in social media.

2023-01-21: Our paper "An Exploratory Study of Demonym Biases in GPT-2" was accepted to appear at EACL.

2023-01-12: Our proposal "Understanding the Prevalence of Drinking Water Service Disruption through Large-Scale Analysis of News Articles and Social Media" was recently funded by Penn State's Center for Socially Responsible AI. Here's an announcement in Penn State News.

2022-12-09: This article in Penn State News describes my PhD student Younes Karimi's work to automatically identify doxing on Twitter.

2022-12-02: HLT Lab undergraduate researcher Nora O'Toole is featured in this Penn State News article.

2022-10-17: Here's an article in Penn State News about my PhD students' work identifying biases in language models against terms that describe people with disabilities.

2022-10-05: I've been selected to join the Steering Committee for Penn State's Center for Socially Responsible Artificial Intelligence.

2022-08-14: I will participate in the seminar "Privacy in Speech and Language Technology" at Dagstuhl in late August.

2022-07-01: We've released the GPI ("Government Privacy Instructions") Corpus, a collection of 1,043 privacy laws, regulations, and guidelines from 182 jurisdictions around the world. Read our arXiv paper about it and download it here.

2022-06-02: I led (with Athina Markopoulou) a breakout session titled "Privacy, Policy, and People" at the NSF SaTC PI Meeting.

2022-04-22: Thanks to Sepideh Ghanavati for virtually hosting me for a talk with her lab at the University of Maine.

2022-04-06: We have two papers accepted to LREC: "STAPI: An Automatic Scraper for Extracting Iterative Title-Text Structure from Web Documents" and "A Tale of Two Regulatory Regimes: Creation and Analysis of a Bilingual Privacy Policy Corpus".

2022-02-04: I've been elected to Penn State's University Faculty Senate for a four-year term, starting in Fall 2022 and ending in Spring 2026.

2022-01-15: Our paper "Automated Detection of Doxing on Twitter" has been accepted to CSCW 2022. Here's a preprint on arXiv.

2021-12-17: I added a Guide for Student Research to Advice for Students.

2021-12-09: VentureBeat has an article that quotes my PhD student Pranav Venkit and me about our research to quantify bias in language models against people with disabilities. Here's our arXiv paper about the work.

2021-11-04: El País has an article about our ICWE 2021 paper on PrivaSeer.

2021-11-04: I added My Academic Journey to Advice for Students.

2021-09-16: I am chairing the selection committee for IST's Westin Scholars Award. I encourage undergraduates in the College of IST to apply.

2021-08-15: This Digital Trends article quotes me about the risk of privacy becoming a luxury feature, and the durability of our search engine preferences.

2021-07-28: Here's a press release about the $1.2M NSF award I'm leading to build corpora, language models, and proof-of-concept tools that make use of large quantities of privacy-related text on the web. Here's the project website.

2021-06-20: I added a Guide to Professorspeak to Advice for Students.

2021-05-25: Here's a press release about our interdisciplinary collaboration to study police language to find patterns that may lead to tragic outcomes when police encounter male minority youth.

2021-05-18: I'm hiring an Assistant Research Professor to join my lab.

2021-02-14: I added Thoughts on Competition in Academia to Advice for Students.

2021-01-15: Here's a press release about Opt-Out Easy, a web browser extension we created to help internet users protect their privacy. This was the product of a collaboration with researchers at Carnegie Mellon University, Stanford University, and the University of Michigan.

2020-12-03: I added Thoughts on the Advice to Advice for Students. It explains why I created Advice for Students and the history of that part of my website.

2020-10-28: This Penn State News article quotes me about attending Virtual Grace Hopper last month.

2020-10-24: Our paper "From Prescription to Description: Mapping the GDPR to a Privacy Policy Corpus Annotation Scheme" has been accepted for presentation at JURIX.

2020-08-11: My advisees presented three posters at (virtual) SOUPS. They covered our research on limitations in the availability of privacy policies on the web, automatically generating titles for sections of privacy policies, and differences in scam emails written in different languages.

2020-07-08: I co-hosted a mentoring session at (virtual) ACL 2020 about finding good advisors and mentors. Here are our slides from the session.

2020-06-16: I added Thoughts on Failure to my Advice for Students. It's a personal narrative about my biggest failures, with some advice from experience. Students who set challenging goals for themselves may find the contents of this page particularly relatable.

2020-05-20: I would be glad to work with recent PhD graduates on applications for CRA's 2020 CIFellows postdoctoral fellowship. If you're interested, please get in touch soon.

2020-05-09: Congratulations to IST's Spring 2020 graduates! I contributed to this celebratory video that our Office of Marketing and Communications put together.

2020-05-07: I added a Guide for Publishing in Computer/Information Science Venues to my Advice for Students.

2020-04-28: I'm pleased to announce the release of PrivaSeer, our search engine for the privacy policies of 1,005,781 English language websites. You can also read this arXiv paper about the corpus behind the search engine.

2020-01-24: Here's a press release about the Penn State - University of Auckland research collaboration workshop I attended in November 2019.

2020-01-17: Our paper "Finding a Choice in a Haystack: Automatic Extraction of Opt-Out Statements from Privacy Policy Text" has been accepted for publication and oral presentation at The Web Conference.

2020-01-02: Our proposal titled "Exploring the Effects of Socioeconomic Status on Privacy Behaviors in an Online Social Network" will be funded by Penn State's Center for Social Data Analytics. Sarah Rajtmajer and I will work together on this project.

2019-12-01: I recently attended two events: an NSF Workshop on Departmental Plans for Broadening Participation in Computer Science, where I was one of two faculty representing the College of IST, and a Penn State - University of Auckland research collaboration development workshop.

2019-11-11: The Fall 2019 issue of iConnect, the College of IST's magazine, contains a synopsis of my recent work to simplify online privacy using artificial intelligence. It's on page 22.

2019-10-25: I gave a talk for Graduates in IST about finding an academic job. Most of the talk was based upon material in my Guide for the Tenure-Track Job Market in Computer/Information Sciences.

2019-09-24: I wrote a Guide to Citations and References for students who are new to using citations and references, or need to refresh their memory. I wanted to create a guide with a "citation positive" perspective, which I explain in the first section.

2019-09-17: I will lead a breakout session titled "Making Privacy Understandable for Internet Users" at the NSF SaTC PI Meeting in Alexandria on October 27-29. We will discuss efforts to create software, datasets, and algorithms that can play a role in helping internet users understand information about online privacy, including the practices of online entities they interact with and basic information about how privacy works.

2019-08-31: Our paper "Question Answering for Privacy Policies: Combining Computational and Legal Perspectives" has been accepted to EMNLP 2019.

2019-07-23: Here's a press release from Penn State about our recent SaTC grant.

2019-07-12: The College of IST's Instagram account posted a Faculty Friday interview with me. One tiny correction: I first tried laksa when I spent a summer in Singapore, not Malaysia.

2019-07-09: Our NSF SaTC Medium proposal "Automatically Answering People's Privacy Questions" was chosen for funding! I'll be working with Norman Sadeh at Carnegie Mellon University's School of Computer Science and Joel Reidenberg at Fordham University's School of Law to create question answering systems for online privacy.

2019-06-02: I wrote a Guide for Scholarly Writing, aimed at students preparing manuscripts to submit to publication venues.

2019-05-21: I recently spoke at the graduation ceremony for the Philosophy Department at Virginia Tech. I added a video of my speech to the miscellany page.

2019-04-27: I added a Teaching FAQ to my website with information for students enrolled in my classes, especially first-year college students and international students.

2019-04-18: I speak first and last (and in the middle) of this video promoting artificial intelligence research at Penn State. Thanks to PSU's CommAgency students for putting this together.

2019-04-05: I received a seed grant from Penn State's Center for Security Research and Education for a proposal titled "Automatic Detection of Malicious Emails Using Discourse Analysis", with Susan Strauss as a co-PI.

2019-04-02: Welcome to Carly Chiavaroli, who is joining my lab in a research support position.

2019-03-29: Thanks to everyone who attended PAL! If you didn't make it, you can still view the proceedings on our website.

2019-03-22: I received a seed grant from the College of IST for my proposal titled "Web-Scale Search and Analysis of Privacy Policies", with Lee Giles as a co-PI. The grant will fund a co-supervised graduate research assistant for one year.

2019-03-21: Our paper "Vaccine: Obfuscating Access Pattern Against File-Injection Attacks" has been accepted to IEEE CNS.

2019-03-17: I've created an Advising FAQ with information for PhD students, MS students, undergraduates, and visitors who are interested in joining my lab.

2019-02-23: This May I will be the invited speaker for the graduation ceremony of the Philosophy Department at Virginia Tech, my alma mater. Philosophy was one of my majors as an undergraduate, and I am honored they selected me.

2019-01-11: Our paper "Analyzing privacy policies at scale: From crowdsourcing to automated annotations" has been published in ACM Transactions on the Web. Here's a copy.

2018-12-05: Thanks to Jamie Blustein and Computer Science at Dalhousie University for hosting me for a talk today.

2018-11-29: I am pleased to congratulate (belatedly) my Spring 2018 M.S. graduates on their new jobs: Abhijith Athreya is now a Chief Engineer at Samsung R&D, and Baradwaj Aryasomayajula is a Software Developer at Verizon.

2018-10-31: Later this week I will be at EMNLP 2018 to present our poster for our accepted paper. I will also chair the Social Applications II session, and I am looking for Ph.D. students to join my lab. Please chat with me if you are interested.

2018-10-16: Thanks to IST's communications and marketing team for including a section about my photography in this article. That's my picture at the top, and the section about me is toward the end.

2018-10-11: IST has multiple faculty openings. We have an open-rank opening in security and privacy and an assistant professor opening in human-centered design. We're also looking for teaching faculty.

2018-09-25: I am now a Faculty Affiliate of Penn State's Institute for CyberScience.

2018-08-30: Our paper "Supervised and Unsupervised Methods for Robust Separation of Section Titles and Prose Text in Web Documents" has been accepted for presentation at EMNLP in November. Here's the paper. The code and the datasets are on GitHub.

2018-08-10: I am organizing a AAAI Spring Symposium titled "Privacy-Enhancing Artificial Intelligence and Language Technologies" (PAL) at Stanford University on March 25-27, 2019. Consider submitting a paper, or help me promote it with this flyer.

2018-07-26: Congrats to my students Abhijith Mysore and Baradwaj Aryasomayajula on successfully defending their M.S. theses!

2018-07-19: Last week I led a workshop on natural language processing for 20 visiting faculty from Ming Chi University of Taiwan, as part of their visit to the University of Cincinnati. Best wishes for the rest of their stay.

2018-05-07: I was a faculty marshal for the College of Engineering and Applied Sciences at UC's Spring Commencement. Congratulations again to all of our graduates!

2018-03-23: The Usable Privacy Policy Project created a series of videos to explain our research. I speak first in this one, about the natural language processing aspects of the project.

2018-03-01: Our work is featured in NSF News from the Field. See the bottom of the press release (linked from the NSF article) for the full list of collaborators.

2018-02-17: I am proposing a workshop titled "AI and NLP for Usable Privacy" at SOUPS this coming August. The workshop will be a successor event to Privacy and Language Technologies, the AAAI Fall Symposium I organized in 2016. Contact me if you have nominations (including self-nominations) for the program committee.

2018-02-12: Congrats to my M.S. student Kaitlin Burnam on winning the graduate student poster award at the Tri-State Women in Computing Conference this past weekend!

2018-02-02: During the week of March 12, I will teach a mini-course titled "Ethics of Artificial Intelligence" at Future University in Cairo, Egypt.

2018-01-25: Check out our re-launch of the Explore site, now featuring over 7,000 automatically annotated privacy policies.

2017-11-14: Cincinnati's Channel 9 News interviewed me today about FaceID in the iPhone X. (Skip to 1:00 in the video for the correct story.) A few claims in the segment that aired were misattributed to me, although I was still glad to speak with them. Also, although they cited me as a face recognition expert, this is incorrect; I spoke as a privacy researcher.

2017-10-05: Thanks to the University of Cincinnati's IEEE student chapter for hosting me for a talk this evening.

2017-07-13: Our paper "Identifying the provision of choices in privacy policy text" has been accepted for presentation at EMNLP.

2017-06-14: Three of our posters have been accepted for presentation at SOUPS: "Increasing the salience of data use opt-outs online", "Nudges for privacy and security: Understanding and assisting users' choices online", and "Mobile app privacy compliance: Automated technology to help regulators, app stores and developers".

2017-05-13: Thanks to Hongning Wang for hosting me for a talk at the University of Virginia.

2017-04-29: I was a faculty marshal for the College of Engineering and Applied Sciences at UC's Spring Commencement. Congratulations again to all of our graduates!

2017-03-24: Our paper "PrivOnto: A semantic framework for the analysis of privacy policies" has been accepted for publication by the Semantic Web Journal, as part of its special issue "Semantic Web and Linked Data: Security, Privacy and Policy".

2017-02-17: Thanks to the Division of Biomedical Informatics at CCHMC for hosting my talk today as part of the Hutton Lecture Series.

2017-02-14: I have added more content to my miscellany page.

2017-01-03: Our paper "Nudges for privacy and security: Understanding and assisting users' choices online" has been accepted to appear in ACM Computing Surveys. Here's a working copy of the paper on SSRN. Also, Happy New Year!

2016-11-21: Thanks to Nathan Schneider for hosting me for a talk today at the Department of Computer Science at Georgetown University.

2016-11-19: Thanks to everyone who came to PLT! I'll be in contact soon with some post-event announcements.

2016-11-01: Students interested in CS 7052 this spring should take a look at my teaching page for information on the course.

2016-10-24: Our paper "Automating Privacy Law Analysis for Mobile Apps" has been accepted to appear at NDSS'17.

2016-09-09: On November 1 I will give an invited talk as part of HCOMP's Encore Track in Austin, Texas.

2016-07-25: We have released the OPP-115 Corpus, a unique resource of 115 website privacy policies annotated with 23K data practices. The corpus is described in detail in our recent ACL paper.

2016-06-03: Along with my colleagues Alessandro Oltramari and Fei Liu, I am organizing a AAAI Fall Symposium on privacy and language technologies. Visit the symposium website here and help us advertise with this flyer.

2016-05-30: Our paper "The Creation and Analysis of a Website Privacy Policy Corpus" has been accepted for poster presentation at ACL this August, and our poster abstract "Visualization and Interactive Exploration of Data Practices in Privacy Policies" has been accepted by SOUPS.

2016-04-13: Our paper "Crowdsourcing Annotations of Websites’ Privacy Policies: Can It Really Work?" has been selected as one of five Best Paper Finalists at the 25th World Wide Web Conference.

2016-03-31: Our paper "Demystifying Privacy Policies with Language Technologies: Progress and Challenges" has been accepted by the Text Analysis for Cybersecurity and Online Safety workshop at LREC. I will present it.

2016-03-20: Lifehacker posted an article about our data exploration website.

2016-03-15: The Consumerist wrote an article about our data exploration website.

2016-03-10: Here's a data exploration website that my group has created to showcase the privacy policy annotations that we've collected. Also, here's a press release about it.

2016-01-05: I gave a talk as part of the CHIME seminar series at the National University of Singapore. Thanks to Min-Yen Kan for hosting me.

2015-12-16: Our paper "Crowdsourcing Annotations of Websites’ Privacy Policies: Can It Really Work?" has been accepted for presentation at the 25th World Wide Web Conference. The camera-ready version is here.

2015-10-18: Our paper "This Table is Different: A WordNet-Based Approach to Identifying References to Document Entities" has been accepted by the Global Wordnet Conference for a half-hour presentation. Here is the camera-ready version.

2015-06-05: I am pleased to announce that I have accepted an offer for a project scientist position in the School of Computer Science here at Carnegie Mellon University, starting in August. My primary responsibility will be the Usable Privacy Policy Project, a group that I have been involved with since its inception.

2015-04-22: I recently gave talks at Johns Hopkins University's Human Language Technology Center of Excellence (announcement and slides) and the University of Pennsylvania's Positive Psychology Center (slides).

2015-02-20: I had a photograph in the art gallery of Carnegie Mellon University's SCS Day.

2014-10-07: I've been selected to be a Grand Awards Judge in Computer Science at the Intel International Science and Engineering Fair in May 2015.

2014-09-01: Our submission to HCOMP, titled "Identifying relevant text fragments to help crowdsource privacy policy annotations", was accepted for publication.

2014-08-20: I have moved from the University of Edinburgh to the Language Technologies Institute at Carnegie Mellon University for the second part of my NSF IRFP fellowship. It's been great meeting new colleagues and reconnecting with old colleagues from my previous stay here.

2014-05-09: I gave a talk today for the NLIP Seminar Series at the University of Cambridge. Here are my slides. My recent work at the University of Edinburgh is described in the second half of the presentation.

2014-05-01: My short paper submission to ACL in Baltimore was accepted. The dataset it describes is here.

2013-11-04: Here's a belated link to a press release on the Usable Privacy Policy Project that I'm involved with. Also, here's the project website.

2013-10-28: The overview of my research is now up to date.

2013-08-05: I recently arrived at the University of Edinburgh to begin the first part of my NSF International Research Fellowship. It's been great meeting several new colleagues, and I look forward to working here for the next twelve months. Also, in coming months I will be attending Ubicomp in Zurich and IJCNLP in Nagoya to present papers at both.

2013-04-30: I recently gave a talk for the CL+NLP Lunch at Carnegie Mellon. You can take a look at my slides.

2012-11-18: I've updated the overview of my research.

2012-07-02: I've created a stub page to host the metalanguage corpus described in my recent ACL paper.

2012-03-16: My paper "The Creation of a Corpus of English Metalanguage" has been accepted for oral presentation at ACL this July in Jeju, South Korea. I will attend to present it.

2012-02-24: A profile of some of my Ph.D. research is up on

2012-02-12: I've been adding content to the EAPSI page over the past few months, and it is now complete.

2011-10-23: Prompted by my move to Carnegie Mellon last month, I've assembled this long-overdue renovation of my website. I'll add more content in the coming months.