CyLab Seminar - Daphne Ippolito

February 3, 2025
12:00pm - 1:00pm ET

Location: In Person and Virtual - Panther Hollow Conference Room, Mehrabian Collaborative Innovation Center 4105, and Zoom

Speaker: Daphne Ippolito, Assistant Professor, Language Technologies Institute, Carnegie Mellon University, and Senior Research Scientist, Google DeepMind
https://www.daphnei.com/

Modern large language models (LLMs) derive their capabilities from the data used to train their underlying neural networks. While this data is the source of LLMs' strength, it also creates fallibilities. Though the companies releasing LLMs aim to hide their training data from users, we demonstrate how it is surprisingly difficult to keep malicious, or even typical, users from accessing long strings of text that LLMs have memorized from the source data. Furthermore, most training data is derived from large-scale crawls of the Internet. We investigate whether, by poisoning portions of the Internet, an adversary can insert backdoors or otherwise change the behavior of the LLMs trained on this data.

Daphne Ippolito is an assistant professor at the Language Technologies Institute at Carnegie Mellon University and a senior research scientist at Google DeepMind. Among other topics, she studies privacy and security issues around language generation systems, strategies for better evaluation of language models, and customizability of language models for different real-world applications.

In Person and Zoom Participation. See announcement.

Event Website: https://www.cylab.cmu.edu/events/2025/02/03-seminar-ippolito.html