About Me

Who I Am

I (he/him) am an Assistant Professor in the College of Information Sciences and Technology at Penn State, where I lead the Human Language Technologies Lab. My research spans natural language processing, privacy, security, and computational social science.

From 2016 until 2018, I was an Assistant Professor in the EECS Department at the University of Cincinnati. Prior to that, I was a postdoc and a lecturer in Carnegie Mellon University's School of Computer Science and an NSF International Research Fellow in the University of Edinburgh's School of Informatics. I received my PhD in Computer Science from the University of Maryland in 2011.

Contacting Me

You can reach me at shomir _at_ psu.edu. If you are on Penn State's University Park campus, you can stop by my office at E310 in the Westgate Building.

Students taking my classes may benefit from browsing my Guide for Interacting With Faculty before contacting me.

Students interested in joining my lab: I sometimes have openings for PhD students, MS students, and undergraduates to work on projects related to natural language processing or privacy. If you're interested in working with me, first consult my Guide for Joining My Lab and then email me with "read your recruiting note" as the subject line. Include a CV and an explanation of your specific interests in my research.

Curriculum Vitae

Here it is.

Latest News

I also sometimes post news and thoughts to Twitter.

2022-04-06: We have two papers accepted to LREC: "STAPI: An Automatic Scraper for Extracting Iterative Title-Text Structure from Web Documents" and "A Tale of Two Regulatory Regimes: Creation and Analysis of a Bilingual Privacy Policy Corpus".

2022-02-04: I've been elected to Penn State's University Faculty Senate for a four-year term, starting in Fall 2022 and ending in Spring 2026.

2022-01-15: Our paper "Automated Detection of Doxing on Twitter" has been accepted to CSCW 2022. Here's a preprint on arXiv.

For older news, check the archive.