AI Seminar - Samy Bengio

— 1:00pm

Location:
In Person and Virtual - ET - ASA Conference Room, Gates Hillman 6115 and Zoom

Speaker:
SAMY BENGIO, Senior Director, AI and Machine Learning Research, Apple
http://bengio.abracadoudou.com

Can Transformers predict new syllogisms by composing established ones?  More generally, what type of targets can be learned by such models from scratch? Recent works show that Transformers can be Turing-complete in terms of expressivity, but this does not address the learnability objective. 

This presentation puts forward the notion of 'globality degree' to capture when weak learning is efficiently achievable by regular Transformers, where the latter measures the least number of tokens required in addition to the tokens histogram to correlate nontrivially with the target. As shown experimentally and theoretically under additional assumptions, distributions with high globality cannot be learned efficiently. 

In particular, syllogisms cannot be composed on long chains. Furthermore, we show that (i) an agnostic scratchpad cannot help to break the globality barrier, (ii) an educated scratchpad can help if it breaks the globality barrier at each step, (iii) a notion of 'inductive scratchpad' can both break the globality barrier and improve the out-of-distribution generalization, e.g., generalizing to almost double input size for some arithmetic tasks. 

— 

Samy Bengio (PhD in computer science, University of Montreal, 1993) is a senior director of machine learning research at Apple since 2021 and an adjunct professor at EPFL since 2024. Before that, he was a distinguished scientist at Google Research since 2007 where he was heading part of the Google Brain team, and at IDIAP in the early 2000s where he co-wrote the well-known open-source Torch machine learning library. 

His research interests span many areas of machine learning such as deep architectures, representation learning, vision and language processing and more recently, reasoning. He is action editor of the Journal of Machine Learning Research and on the board of the NeurIPS foundation. 

He was on the editorial board of the Machine Learning Journal, has been program chair (2017) and general chair (2018) of NeurIPS, program chair of ICLR (2015, 2016), general chair of BayLearn (2012-2015), MLMI (2004-2006), as well as NNSP (2002), and on the program committee of several international conferences such as NeurIPS, ICML, ICLR, ECML and IJCAI. 

In Person and Zoom Participation.  See announcement.

Event Website:
http://www.cs.cmu.edu/~aiseminar/


Add event to Google
Add event to iCal