AI/ML Seminar Series


Weekly Seminar in AI & Machine Learning
Sponsored by Cylance

Apr. 10
DBH 4011
1 pm

Durk Kingma

Research Scientist
Google Research

Some believe that maximum likelihood is incompatible with high-quality image generation. We provide counter-evidence: diffusion models with SOTA FIDs (e.g. https://arxiv.org/abs/2301.11093) are actually optimized with the ELBO, with very simple data augmentation (additive noise). First, we show that diffusion models in the literature are optimized with various objectives that are special cases of a weighted loss, where the weighting function specifies the weight per noise level. Uniform weighting corresponds to maximizing the ELBO, a principled approximation of maximum likelihood. In current practice diffusion models are optimized with non-uniform weighting due to better results in terms of sample quality. In this work we expose a direct relationship between the weighted loss (with any weighting) and the ELBO objective. We show that the weighted loss can be written as a weighted integral of ELBOs, with one ELBO per noise level. If the weighting function is monotonic, as in some SOTA models, then the weighted loss is a likelihood-based objective: it maximizes the ELBO under simple data augmentation, namely Gaussian noise perturbation. Our main contribution is a deeper theoretical understanding of the diffusion objective, but we also performed some experiments comparing monotonic with non-monotonic weightings, finding that monotonic weighting performs competitively with the best published results.

Bio: I do research on principled and scalable methods for machine learning, with a focus on generative models. My contributions include the Variational Autoencoder (VAE), the Adam optimizer, Glow, and Variational Diffusion Models, but please see Scholar for a more complete list. I obtained a PhD (cum laude) from the University of Amsterdam in 2017, and was part of the founding team of OpenAI in 2015. Before that, I co-founded Advanza, which was acquired in 2016. My formal name is Diederik, but I have the Frisian nickname Durk (pronounced like Dirk). I currently live in the San Francisco Bay Area.
Apr. 17
DBH 4011
1 pm

Danish Pruthi

Assistant Professor
Department of Computational and Data Sciences (CDS)
Indian Institute of Science (IISc), Bangalore

TBA.

Bio: I received my PhD from CMU, where I was co-advised by Zachary C. Lipton and Graham Neubig. My doctoral research focused on addressing issues concerning the interpretability of deep learning models. I completed my bachelor’s degree in computer science at BITS Pilani, Pilani in 2015. I’ve also spent time doing research at Google AI, Microsoft Research, Facebook AI Research, and Amazon AI. I am a recipient of the Siebel Scholarship and the CMU Presidential Fellowship.
Apr. 24
DBH 4011
1 pm

Anthony Chen

PhD Student
Department of Computer Science, UC Irvine

TBA.

Bio: I am a final-year Ph.D. student working on machine learning and language understanding at UC Irvine, advised by Sameer Singh, and a research intern at Google hosted by Hongrae Lee and Kelvin Guu. I was a research intern at Verneek and Apple. I’ve also had the fortune to collaborate with Gabi Stanovsky and Matt Gardner. My research focuses on how we can evaluate the limits of large language models and design efficient methods to address their deficiencies. Recently, I’ve been tackling the pernicious problem of attribution and hallucinations in language models, such as understanding the cause of and removing hallucinations from the outputs of large language models, making them more reliable to use.
May 1
DBH 4011
1 pm

Hengrui Cai

Assistant Professor of Statistics
University of California, Irvine

The causal revolution has spurred interest in understanding complex relationships in various fields. Under a general causal graph, the exposure may have a direct effect on the outcome and also an indirect effect regulated by a set of mediators. An analysis of causal effects that interprets the causal mechanism contributed through mediators is hence challenging but in demand. In this talk, we introduce a new statistical framework to comprehensively characterize causal effects with multiple mediators, namely, ANalysis Of Causal Effects (ANOCE). Built upon such causal impact learning, we focus on two emerging challenges in causal relation learning: heterogeneity and spuriousness. To characterize the heterogeneity, we first conceptualize heterogeneous causal graphs (HCGs) by generalizing the causal graphical model with confounder-based interactions and multiple mediators. In practice, only a small number of variables in the graph are relevant for the outcomes of interest. As a result, causal estimation with the full causal graph, especially given limited data, could lead to many falsely discovered, spurious variables that may be highly correlated with but have no causal impact on the target outcome. We propose to learn a class of necessary and sufficient causal graphs (NSCG) that only contain causally relevant variables by utilizing the probabilities of causation. Across empirical studies of simulated and real data applications, we show that the proposed algorithms outperform existing ones and can reveal true heterogeneous and non-spurious causal graphs.

Bio: Dr. Hengrui Cai is an Assistant Professor in the Department of Statistics at the University of California Irvine. She obtained her Ph.D. degree in Statistics at North Carolina State University in 2022. Cai has broad research interests in methodology and theory in causal inference, reinforcement learning, and graphical modeling, to establish reliable, powerful, and interpretable solutions to real-world problems. Currently, her research focuses on causal inference and causal structure learning, and policy optimization and evaluation in reinforcement/deep learning. Her work has been published in conferences including ICLR, NeurIPS, ICML, and IJCAI, as well as journals including the Journal of Machine Learning Research, Stat, and Statistics in Medicine.
May 8
DBH 4011
1 pm

Pierre Baldi and Alexander Shmakov

Department of Computer Science, UC Irvine

The Baldi group will present ongoing progress in the theory and applications of deep learning. On the theory side, we will discuss homogeneous activation functions and their important connections to the concept of generalized neural balance. On the application side, we will present applications of neural transformers to physics, in particular for the assignment of observation measurements to the leaves of partial Feynman diagrams in particle physics. In these applications, the permutation invariance properties of transformers are used to capture fundamental symmetries (e.g. matter vs antimatter) in the laws of physics.

Bio: Pierre Baldi earned M.S. degrees in mathematics and psychology from the University of Paris, France, in 1980, and the Ph.D. degree in mathematics from Caltech, CA, USA, in 1986. He is currently a Distinguished Professor with the Department of Computer Science, Director with the Institute for Genomics and Bioinformatics, and Associate Director with the Center for Machine Learning and Intelligent Systems at the University of California, Irvine, CA, USA. His research interests include understanding intelligence in brains and machines. He has made several contributions to the theory of deep learning, and developed and applied deep learning methods for problems in the natural sciences. He has written 4 books and over 300 peer-reviewed articles. Dr. Baldi was the recipient of the 1993 Lew Allen Award at JPL, the 2010 E. R. Caianiello Prize for research in machine learning, and a 2014 Google Faculty Research Award. He is an Elected Fellow of the AAAS, AAAI, IEEE, ACM, and ISCB. Alexander Shmakov is a Ph.D. student in the Baldi research group who loves everything deep learning and robotics. He has published papers on applications of deep learning to planning, robotic control, high energy physics, astronomy, chemical synthesis, and biology.
May 15
DBH 4011
1 pm

Guy Van den Broeck

Associate Professor of Computer Science
University of California, Los Angeles

Many expect that AI will go from powering chatbots to providing mental health services. That it will go from advertisement to deciding who is given bail. The expectation is that AI will solve society’s problems by simply being more intelligent than we are. Implicit in this bullish perspective is the assumption that AI will naturally learn to reason from data: that it can form trains of thought that “make sense”, similar to how a mental health professional or judge might reason about a case, or more formally, how a mathematician might prove a theorem. This talk will investigate the question of whether this behavior can be learned from data, and how we can design the next generation of AI techniques that can achieve such capabilities, focusing on neuro-symbolic learning and tractable deep generative models.

Bio: Guy Van den Broeck is an Associate Professor and Samueli Fellow at UCLA, in the Computer Science Department, where he directs the StarAI lab. His research interests are in Machine Learning, Knowledge Representation and Reasoning, and Artificial Intelligence in general. His papers have been recognized with awards from key conferences such as AAAI, UAI, KR, OOPSLA, and ILP. Guy is the recipient of an NSF CAREER award, a Sloan Fellowship, and the IJCAI-19 Computers and Thought Award.
May 22
TBA
May 29
No Seminar (Memorial Day)
June 5
DBH 4011
1 pm

Sangeetha Jothi

Assistant Professor of Computer Science
University of California, Irvine

TBA

Bio: I am an Assistant Professor in the Computer Science department at the University of California, Irvine. My research interests lie at the intersection of computer systems, networking, and machine learning. Prior to UCI, I completed my Ph.D. at the University of Illinois, Urbana-Champaign in 2019 where I was advised by Brighten Godfrey and had a brief stint as a postdoc at VMware Research. I am currently an Affiliated Researcher at VMware Research. I lead the Networking, Systems, and AI Lab (NetSAIL) at UCI. My current focus revolves around: Internet and Cloud Resilience, and Systems and Machine Learning.