Fall 2023

Oct. 9
Oct. 16
DBH 4011
1 pm

Marius Kloft

Professor of Computer Science
RPTU Kaiserslautern-Landau, Germany

Anomaly detection is one of the fundamental topics in machine learning and artificial intelligence. The aim is to find instances deviating from the norm – so-called ‘anomalies’. Anomalies can be observed in various scenarios, from attacks on computer or energy networks to critical faults in a chemical factory or rare tumors in cancer imaging data. In my talk, I will first introduce the field of anomaly detection, with an emphasis on ‘deep anomaly detection’ (anomaly detection based on deep learning). Then, I will present recent algorithms and theory for deep anomaly detection, with images as the primary data type. I will demonstrate how these methods can be better understood using explainable AI methods. I will also show new algorithms for deep anomaly detection on other data types, such as time series, graphs, tabular data, and contaminated data. Finally, I will close my talk with an outlook on exciting future research directions in anomaly detection and beyond.
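To make the ‘deep anomaly detection’ idea concrete, here is a minimal one-class sketch: embed inputs with a network and score each point by its distance to a center of the normal data in feature space. Everything below (the tiny random linear “network”, the toy data, the mean-embedding center) is an invented illustration, not the speaker’s actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "network": a fixed random linear map standing in for a trained deep encoder.
W = rng.normal(size=(2, 2))
phi = lambda x: np.tanh(x @ W)  # feature map

# Normal training data clustered near the origin; anomalies far away.
normal = rng.normal(0.0, 0.5, size=(200, 2))
anomaly = rng.normal(4.0, 0.5, size=(5, 2))

# One-class idea: fix a center c as the mean embedding of the normal data,
# then score each point by the squared distance of its embedding to c.
c = phi(normal).mean(axis=0)
score = lambda x: np.sum((phi(x) - c) ** 2, axis=1)

# Normal points should score lower than anomalies on average.
print(score(normal).mean() < score(anomaly).mean())  # expect: True
```

In the deep version the encoder is trained (jointly with the center) rather than fixed, which is precisely where the algorithmic and theoretical questions of the talk arise.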

Bio: Marius Kloft has worked and researched at various institutions in Germany and the US, including TU Berlin (PhD), UC Berkeley (PhD), NYU (Postdoc), Memorial Sloan-Kettering Cancer Center (Postdoc), HU Berlin (Assist. Prof.), and USC (Visiting Assoc. Prof.). Since 2017, he has been a professor of machine learning at RPTU Kaiserslautern-Landau. His research covers a broad spectrum of machine learning, from mathematical theory and fundamental algorithms to applications in medicine and chemical engineering. He received the Google Most Influential Papers 2013 Award, and he is a recipient of the German Research Foundation’s Emmy Noether Career Award. In 2022, the paper ‘Deep One-class Classification’ (ICML, 2018), lead-authored by Marius Kloft, received the ANDEA Test-of-Time Award for the most influential paper in anomaly detection of the last ten years (2012-2022). The paper is highly cited, with around 500 citations per year.
Oct. 23
DBH 4011
1 pm

Sarah Wiegreffe

Postdoctoral Researcher
Allen Institute for AI and University of Washington

Recently released language models have attracted a lot of attention for their major successes and (often more subtle, but still plentiful) failures. In this talk, I will motivate why transparency into model operations is needed to rectify these failures and increase model utility in a reliable way. I will highlight how techniques must be developed in this changing NLP landscape for both open-source models and black-box models behind an API. I will provide an example of each from my recent work, demonstrating how improved transparency can improve language model performance on downstream tasks.

Bio: Sarah Wiegreffe is a young investigator (postdoc) at the Allen Institute for AI (AI2), working on the Aristo project. She also holds a courtesy appointment in the Allen School at the University of Washington. Her research focuses on language model transparency. She received her PhD from Georgia Tech in 2022, during which she interned at Google and AI2. She frequently serves on conference program committees, and she received an Outstanding Area Chair Award at ACL 2023.
Oct. 30
DBH 4011
1 pm

Noga Zaslavsky

Assistant Professor of Language Science
University of California, Irvine

Our world is extremely complex, and yet we are able to exchange our thoughts and beliefs about it using a relatively small number of words. What computational principles can explain this extraordinary ability? In this talk, I argue that in order to communicate and reason about meaning while operating under limited resources, both humans and machines must efficiently compress their representations of the world. In support of this claim, I present a series of studies showing that: (i) human languages evolve under pressure to efficiently compress meanings into words via the Information Bottleneck (IB) principle; (ii) the same principle can help ground meaning representations in artificial neural networks trained for vision; and (iii) these findings offer a new framework for emergent communication in artificial agents. Taken together, these results suggest that efficient compression underlies meaning in language and offer a new approach to guiding artificial agents toward human-like communication without relying on massive amounts of human-generated training data.
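The Information Bottleneck principle mentioned in (i) can be stated concretely: choose an encoder q(w|x) from meanings to words that minimizes I(X;W) − β·I(W;Y), trading off the complexity of the lexicon against the information words carry about the world. A small sketch evaluating that objective on a discrete toy channel; the distributions, sizes, and β are invented for illustration, not taken from the speaker’s studies.

```python
import numpy as np

def mutual_info(p_joint):
    """I(A;B) in nats, computed from a joint distribution table p(a, b)."""
    pa = p_joint.sum(axis=1, keepdims=True)
    pb = p_joint.sum(axis=0, keepdims=True)
    mask = p_joint > 0
    return float(np.sum(p_joint[mask] * np.log(p_joint[mask] / (pa @ pb)[mask])))

# Toy setup: 3 meanings X, 2 world features Y, 2 words W (all invented).
p_x = np.array([0.5, 0.3, 0.2])              # p(x): meaning frequencies
p_y_given_x = np.array([[0.9, 0.1],          # p(y|x): what each meaning predicts
                        [0.8, 0.2],
                        [0.1, 0.9]])
q_w_given_x = np.array([[1.0, 0.0],          # encoder q(w|x): meanings -> words
                        [1.0, 0.0],
                        [0.0, 1.0]])

p_xw = p_x[:, None] * q_w_given_x            # joint p(x, w)
p_w = p_xw.sum(axis=0)                       # marginal p(w)
# p(y|w) by marginalizing meanings through the encoder, then the joint p(w, y).
p_y_given_w = (p_xw.T @ p_y_given_x) / p_w[:, None]
p_wy = p_w[:, None] * p_y_given_w

beta = 1.0
ib_objective = mutual_info(p_xw) - beta * mutual_info(p_wy)
print(round(ib_objective, 3))
```

Sweeping β traces out the complexity-informativeness frontier against which the cross-linguistic studies compare attested lexicons.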

Bio: Noga Zaslavsky is an Assistant Professor in UCI’s Language Science department. Before joining UCI this year, she was a postdoctoral fellow at MIT. She holds a Ph.D. (2020) in Computational Neuroscience from the Hebrew University, and during her graduate studies she was also affiliated with UC Berkeley. Her research aims to understand the computational principles that underlie language and cognition by integrating methods from machine learning, information theory, and cognitive science. Her work has been recognized by several awards, including a K. Lisa Yang Integrative Computational Neuroscience Postdoctoral Fellowship, an IBM Ph.D. Fellowship Award, and a 2018 Computational Modeling Prize from the Cognitive Science Society.
Nov. 6
DBH 4011
1 pm

Mariel Werner

PhD Student
Department of Electrical Engineering and Computer Science, UC Berkeley

I will be discussing my recent work on personalization in federated learning. Federated learning is a powerful distributed optimization framework in which multiple clients collaboratively train a global model without sharing their raw data. In this work, we tackle the personalized version of the federated learning problem. In particular, we ask: throughout the training process, can clients identify a subset of similar clients and collaboratively train with just those clients? Answering in the affirmative, we propose simple clustering-based methods which are provably optimal for a broad class of loss functions (the first such guarantees), are robust to malicious attackers, and perform well in practice.
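The clustering idea can be caricatured in a few lines: clients whose local updates land close together are grouped, and each client then averages only within its own group instead of across everyone. The toy below (quadratic local losses, two planted clusters, one-shot threshold clustering) is an invented simplification, not the authors’ algorithm or guarantees.

```python
import numpy as np

rng = np.random.default_rng(1)

# Each client minimizes a quadratic loss (w - w_i*)^2 with a cluster-specific
# optimum: clients 0-4 share an optimum near 0, clients 5-9 near 10.
optima = np.concatenate([rng.normal(0, 0.1, 5), rng.normal(10, 0.1, 5)])

# One local gradient-descent step from a common initialization w0.
w0, lr = 5.0, 0.4
local = w0 - lr * 2 * (w0 - optima)  # gradient of (w - w*)^2 is 2(w - w*)

# Threshold clustering on pairwise distances between the local models.
threshold = 2.0
clusters = [np.flatnonzero(np.abs(local - local[i]) < threshold) for i in range(10)]

# Personalized model: average only over the client's own cluster.
personalized = np.array([local[c].mean() for c in clusters])
naive_global = np.full(10, local.mean())  # FedAvg-style single global model

err_pers = np.abs(personalized - optima).mean()
err_glob = np.abs(naive_global - optima).mean()
print(err_pers < err_glob)  # expect: True -- clustering helps here
```

With two well-separated client populations, a single global model is pulled toward the overall mean and serves neither group well, which is exactly the failure mode personalization targets.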

Bio: Mariel Werner is a 5th-year PhD student in the Department of Electrical Engineering and Computer Science at UC Berkeley advised by Michael I. Jordan. Her research focus is federated learning, with a particular interest in economic applications. Currently, she is working on designing data-sharing mechanisms for firms in oligopolistic markets, motivated by ideas from federated learning. Recently, she has also been studying dynamics of privacy and reputation-building in principal-agent interactions. Mariel holds an undergraduate degree in Applied Mathematics from Harvard University.
Nov. 13
DBH 4011
1 pm

Yian Ma

Assistant Professor, Halıcıoğlu Data Science Institute
University of California, San Diego

I will introduce some recent progress towards understanding the scalability of Markov chain Monte Carlo (MCMC) methods and their comparative advantage with respect to variational inference. I will fact-check the folklore that “variational inference is fast but biased, MCMC is unbiased but slow”. I will then discuss a combination of the two via reverse diffusion, which holds promise for solving some multi-modal problems. This talk will be motivated by the need for Bayesian computation in reinforcement learning problems as well as the differential privacy requirements that we face.
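As a concrete point of reference for the MCMC side of that folklore, the unadjusted Langevin algorithm (a basic gradient-based sampler) needs only the score ∇log p and converges to a slightly biased version of the target, with bias controlled by the step size. A toy run targeting a 1-D standard Gaussian; the step size and iteration counts are arbitrary choices for illustration, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: standard Gaussian, whose log-density gradient is -x.
grad_log_p = lambda x: -x

# Unadjusted Langevin: x <- x + (h/2) * grad log p(x) + sqrt(h) * noise.
h, n_steps = 0.1, 20000
x = 3.0  # start far from the mode
samples = []
for t in range(n_steps):
    x = x + 0.5 * h * grad_log_p(x) + np.sqrt(h) * rng.normal()
    if t > 1000:  # discard burn-in
        samples.append(x)

samples = np.array(samples)
# For small h, the sample mean and variance approach the target's 0 and 1,
# up to an O(h) discretization bias -- "unbiased but slow" is already a caricature.
print(abs(samples.mean()) < 0.2, abs(samples.var() - 1.0) < 0.25)
```

On a multi-modal target the same chain can take exponentially long to hop between modes, which is the regime where the reverse-diffusion combination discussed in the talk becomes attractive.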

Bio: Yian Ma is an assistant professor at the Halıcıoğlu Data Science Institute and an affiliated faculty member at the Computer Science and Engineering Department of UC San Diego. Prior to UCSD, he spent a year as a visiting faculty member at Google Research. Before that, he was a post-doctoral fellow at UC Berkeley, hosted by Mike Jordan. Yian completed his Ph.D. at the University of Washington. His current research primarily revolves around scalable inference methods for credible machine learning, with applications to time series data and sequential decision-making tasks. He has received a Facebook Research Award and the Best Paper Award at the NeurIPS AABI Symposium.
Nov. 20
DBH 4011
1 pm

Yuhua Zhu

Assistant Professor, Halıcıoğlu Data Science Institute and Dept. of Mathematics
University of California, San Diego

In this talk, I will build a connection between Hamilton-Jacobi-Bellman (HJB) equations and multi-armed bandit (MAB) problems. The HJB equation is an important tool for solving stochastic optimal control problems, while the MAB problem is a widely used paradigm for studying the exploration-exploitation trade-off in sequential decision-making under uncertainty. This is the first work that establishes this connection in a general setting. I will present an efficient algorithm for solving MAB problems based on this connection and demonstrate its practical applications. This is joint work with Lexing Ying and Zach Izzo from Stanford University.

Bio: Yuhua Zhu is an assistant professor at UC San Diego, where she holds a joint appointment in the Halıcıoğlu Data Science Institute (HDSI) and the Department of Mathematics. Previously, she was a Postdoctoral Fellow at Stanford University, mentored by Lexing Ying. She earned her Ph.D. from UW-Madison in 2019, advised by Shi Jin, and she obtained her BS in Mathematics from SJTU in 2014. Her work builds a bridge between differential equations and machine learning, spanning the areas of reinforcement learning, stochastic optimization, sequential decision-making, and uncertainty quantification.
Nov. 21
DBH 4011
11 am

Yejin Choi

Wissner-Slivka Professor of Computer Science & Engineering
University of Washington and Allen Institute for Artificial Intelligence

In this talk, I will question if there can be possible impossibilities of large language models (i.e., the fundamental limits of transformers, if any) and the impossible possibilities of language models (i.e., seemingly impossible alternative paths beyond scale, if at all).

Bio: Yejin Choi is Wissner-Slivka Professor and a MacArthur Fellow at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. She is also a senior director at AI2 overseeing the project Mosaic and a Distinguished Research Fellow at the Institute for Ethics in AI at the University of Oxford. Her research investigates if (and how) AI systems can learn commonsense knowledge and reasoning, if machines can (and should) learn moral reasoning, and various other problems in NLP, AI, and Vision including neuro-symbolic integration, language grounding with vision and interactions, and AI for social good. She is a co-recipient of 2 Test of Time Awards (at ACL 2021 and ICCV 2021), 7 Best/Outstanding Paper Awards (at ACL 2023, NAACL 2022, ICML 2022, NeurIPS 2021, AAAI 2019, and ICCV 2013), the Borg Early Career Award (BECA) in 2018, the inaugural Alexa Prize Challenge in 2017, and IEEE AI’s 10 to Watch in 2016.
Nov. 27
DBH 4011
1 pm

Tryphon Georgiou

Distinguished Professor of Mechanical and Aerospace Engineering
University of California, Irvine

The energetic cost of information erasure and of energy transduction can be cast as a stochastic optimization problem: minimizing entropy production during thermodynamic transitions. This formalism of Stochastic Thermodynamics allows quantitative assessment of work exchange and entropy production for systems that are far from equilibrium. In the talk we will highlight the cost of Landauer’s bit-erasure in finite time and explain how to obtain bounds on the performance of Carnot-like thermodynamic engines and of processes that are powered by thermal anisotropy. The talk will be largely based on joint work with Olga Movilla Miangolarra, Amir Taghvaei, Rui Fu, and Yongxin Chen.

Bio: Tryphon T. Georgiou was educated at the National Technical University of Athens, Greece (1979) and the University of Florida, Gainesville (PhD 1983). He is currently a Distinguished Professor at the Department of Mechanical and Aerospace Engineering, University of California, Irvine. He is a Fellow of IEEE, SIAM, IFAC, AAAS and a Foreign Member of the Royal Swedish Academy of Engineering Sciences (IVA).
Dec. 4
DBH 4011
1 pm

Deying Kong

Software Engineer, Google

Despite its extensive range of potential applications in virtual reality and augmented reality, 3D interacting hand pose estimation from RGB images remains a very challenging problem, due to appearance confusions between keypoints of the two hands and severe hand-hand occlusion. Thanks to their ability to capture long-range relationships between keypoints, transformer-based methods have gained popularity in the research community. However, existing methods usually deploy tokens at the keypoint level, which inevitably results in high computational and memory complexity. In this talk, we will propose a simple yet novel mechanism, hand-level tokenization, in our transformer-based model, where we deploy only one token for each hand. With this novel design, we will also propose a pose query enhancer module, which can refine the pose prediction iteratively by focusing on features guided by previous coarse pose predictions. As a result, our proposed model, Handformer2T, can achieve high performance while remaining lightweight.
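The computational argument for hand-level tokens is easy to see in numbers: self-attention cost grows with the square of the token count, so 2 hand tokens versus 42 keypoint tokens (21 keypoints per hand) shrinks the attention matrix from 42×42 to 2×2. A tiny single-head attention sketch over invented feature vectors, purely to illustrate that scaling; it is not the actual Handformer2T architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # feature dimension (arbitrary)

def self_attention(tokens, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention over n tokens."""
    Q, K, V = tokens @ Wq, tokens @ Wk, tokens @ Wv
    scores = Q @ K.T / np.sqrt(d)                   # (n, n) attention matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

# Keypoint-level tokens: 21 keypoints per hand, 2 hands -> 42 tokens.
keypoint_tokens = rng.normal(size=(42, d))
# Hand-level tokens: one pooled token per hand -> 2 tokens.
hand_tokens = rng.normal(size=(2, d))

out_kp = self_attention(keypoint_tokens, Wq, Wk, Wv)
out_hand = self_attention(hand_tokens, Wq, Wk, Wv)
print(out_kp.shape, out_hand.shape)  # attention matrices were 42x42 vs 2x2
```

With one token per hand, the per-keypoint detail must be recovered elsewhere, which is the role the abstract assigns to the iterative pose query enhancer.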

Bio: Deying Kong is currently a software engineer at Google. He earned his PhD in Computer Science from the University of California, Irvine in 2022, under the supervision of Professor Xiaohui Xie. His research interests mainly focus on computer vision, especially hand/human pose estimation.
Dec. 11
No Seminar (Finals Week and NeurIPS Conference)