New NSF AI Grant for Prof. Rina Dechter


Computer Science Professor Rina Dechter of UC Irvine’s Donald Bren School of Information and Computer Sciences (ICS) is one of five co-principal investigators for a $5 million, multi-institutional National Science Foundation grant titled “Causal Foundations of Decision Making and Learning.” The grant, which aims to revolutionize AI decision-making by advancing the science of causal inference, is a collaboration among UCI, Columbia University, and USC.

Fall 2023

Oct. 9
Oct. 16
DBH 4011
1 pm

Marius Kloft

Professor of Computer Science
RPTU Kaiserslautern-Landau, Germany

Anomaly detection is one of the fundamental topics in machine learning and artificial intelligence. The aim is to find instances deviating from the norm – so-called ‘anomalies’. Anomalies can be observed in various scenarios, from attacks on computer or energy networks to critical faults in a chemical factory or rare tumors in cancer imaging data. In my talk, I will first introduce the field of anomaly detection, with an emphasis on ‘deep anomaly detection’ (anomaly detection based on deep learning). Then, I will present recent algorithms and theory for deep anomaly detection, with images as the primary data type. I will demonstrate how these methods can be better understood using explainable AI methods. I will show new algorithms for deep anomaly detection on other data types, such as time series, graphs, tabular data, and contaminated data. Finally, I will close my talk with an outlook on exciting future research directions in anomaly detection and beyond.
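For intuition only (an illustrative sketch of ours, not code from the talk): even a classical, non-deep detector captures the core idea of scoring how far an instance deviates from the norm. Here, a model of "normal" data is fit on synthetic 2-D samples and new points are scored by Mahalanobis distance:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" training data: samples from a single known distribution.
normal = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))

# Fit a simple model of the norm: mean and covariance.
mu = normal.mean(axis=0)
cov = np.cov(normal, rowvar=False)
cov_inv = np.linalg.inv(cov)

def anomaly_score(x):
    """Mahalanobis distance of x from the fitted 'normal' distribution."""
    d = x - mu
    return float(np.sqrt(d @ cov_inv @ d))

# A typical point scores low; an outlier scores high.
typical = np.array([0.1, -0.2])
outlier = np.array([6.0, 6.0])
assert anomaly_score(outlier) > anomaly_score(typical)
```

Deep anomaly detection replaces the fixed feature space here with learned representations, but the scoring principle is the same.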

Bio: Marius Kloft has worked and researched at various institutions in Germany and the US, including TU Berlin (PhD), UC Berkeley (PhD), NYU (Postdoc), Memorial Sloan-Kettering Cancer Center (Postdoc), HU Berlin (Assist. Prof.), and USC (Visiting Assoc. Prof.). Since 2017, he has been a professor of machine learning at RPTU Kaiserslautern-Landau. His research covers a broad spectrum of machine learning, from mathematical theory and fundamental algorithms to applications in medicine and chemical engineering. He received the Google Most Influential Papers 2013 Award, and he is a recipient of the German Research Foundation’s Emmy Noether Career Award. In 2022, the paper ‘Deep One-Class Classification’ (ICML 2018), lead-authored by Marius Kloft, received the ANDEA Test-of-Time Award for the most influential paper in anomaly detection in the last ten years (2012–2022). The paper is highly cited, with around 500 citations per year.
Oct. 23
DBH 4011
1 pm

Sarah Wiegreffe

Postdoctoral Researcher
Allen Institute for AI and University of Washington

Recently released language models have attracted a lot of attention for their major successes and (often more subtle, but still plentiful) failures. In this talk, I will motivate why transparency into model operations is needed to rectify these failures and increase model utility in a reliable way. I will highlight how techniques must be developed in this changing NLP landscape for both open-source models and black-box models behind an API. I will provide an example of each from my recent work, demonstrating how improved transparency can improve language model performance on downstream tasks.

Bio: Sarah Wiegreffe is a young investigator (postdoc) at the Allen Institute for AI (AI2), working on the Aristo project. She also holds a courtesy appointment in the Allen School at the University of Washington. Her research focuses on language model transparency. She received her PhD from Georgia Tech in 2022, during which she interned at Google and AI2. She frequently serves on conference program committees, and received an Outstanding Area Chair Award at ACL 2023.
Oct. 30
DBH 4011
1 pm

Noga Zaslavsky

Assistant Professor of Language Science
University of California, Irvine

Our world is extremely complex, and yet we are able to exchange our thoughts and beliefs about it using a relatively small number of words. What computational principles can explain this extraordinary ability? In this talk, I argue that in order to communicate and reason about meaning while operating under limited resources, both humans and machines must efficiently compress their representations of the world. In support of this claim, I present a series of studies showing that: (i) human languages evolve under pressure to efficiently compress meanings into words via the Information Bottleneck (IB) principle; (ii) the same principle can help ground meaning representations in artificial neural networks trained for vision; and (iii) these findings offer a new framework for emergent communication in artificial agents. Taken together, these results suggest that efficient compression underlies meaning in language and offer a new approach to guiding artificial agents toward human-like communication without relying on massive amounts of human-generated training data.
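As a toy illustration of the compression side of this argument (our sketch, not material from the talk): a lexicon mapping meanings to words can be scored by the mutual information I(M;W) it retains — the "complexity" term in the IB trade-off. A one-word-per-meaning lexicon is maximally informative but maximally complex; a coarser lexicon compresses more:

```python
import numpy as np

def mutual_info(p_xy):
    """I(X;Y) in bits for a joint distribution given as a 2-D array."""
    px = p_xy.sum(axis=1, keepdims=True)
    py = p_xy.sum(axis=0, keepdims=True)
    mask = p_xy > 0
    return float((p_xy[mask] * np.log2(p_xy[mask] / (px @ py)[mask])).sum())

# Toy meaning space: 4 equiprobable meanings, deterministically named.
p_m = np.full(4, 0.25)

def joint(encoder):  # encoder[m] = index of the word naming meaning m
    p = np.zeros((4, max(encoder) + 1))
    for m, w in enumerate(encoder):
        p[m, w] += p_m[m]
    return p

# Fine-grained lexicon: one word per meaning -> complexity I(M;W) = 2 bits.
fine = mutual_info(joint([0, 1, 2, 3]))
# Coarse lexicon: two meanings share each word -> complexity 1 bit.
coarse = mutual_info(joint([0, 0, 1, 1]))
assert fine > coarse
```

The full IB principle trades this complexity term against an accuracy term (how well the word preserves what matters about the meaning); the sketch above shows only the complexity side.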

Bio: Noga Zaslavsky is an Assistant Professor in UCI’s Language Science department. Before joining UCI this year, she was a postdoctoral fellow at MIT. She holds a Ph.D. (2020) in Computational Neuroscience from the Hebrew University, and during her graduate studies she was also affiliated with UC Berkeley. Her research aims to understand the computational principles that underlie language and cognition by integrating methods from machine learning, information theory, and cognitive science. Her work has been recognized by several awards, including a K. Lisa Yang Integrative Computational Neuroscience Postdoctoral Fellowship, an IBM Ph.D. Fellowship Award, and a 2018 Computational Modeling Prize from the Cognitive Science Society.
Nov. 6
DBH 4011
1 pm

Mariel Werner

PhD Student
Department of Electrical Engineering and Computer Science, UC Berkeley

I will be discussing my recent work on personalization in federated learning. Federated learning is a powerful distributed optimization framework in which multiple clients collaboratively train a global model without sharing their raw data. In this work, we tackle the personalized version of the federated learning problem. In particular, we ask: throughout the training process, can clients identify a subset of similar clients and collaboratively train with just those clients? Answering in the affirmative, we propose simple clustering-based methods that are provably optimal for a broad class of loss functions (the first such guarantees), are robust to malicious attackers, and perform well in practice.
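A toy sketch of the clustering idea (our illustration under a made-up setting, not the authors' algorithm): clients whose local models are close are grouped together, and each group averages only within itself, rather than into one global model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Two latent clusters of clients: each cluster shares a true model vector.
true = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}
clients = [true[i % 2] + 0.05 * rng.normal(size=2) for i in range(8)]

def cluster_round(models, n_clusters=2, n_iters=5):
    """k-means-style grouping: clients average only with similar clients."""
    centers = [models[0], models[1]]  # simple initialization
    for _ in range(n_iters):
        assign = [int(np.argmin([np.linalg.norm(m - c) for c in centers]))
                  for m in models]
        centers = [np.mean([m for m, a in zip(models, assign) if a == k], axis=0)
                   for k in range(n_clusters)]
    return assign, centers

assign, centers = cluster_round(clients)
# Clients from the same latent cluster end up training together.
assert assign[0] == assign[2] and assign[1] == assign[3]
assert assign[0] != assign[1]
```

In a real federated round the "models" would be locally trained parameter updates, and robustness to malicious clients requires more than plain averaging; this only conveys the grouping intuition.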

Bio: Mariel Werner is a 5th-year PhD student in the Department of Electrical Engineering and Computer Science at UC Berkeley advised by Michael I. Jordan. Her research focus is federated learning, with a particular interest in economic applications. Currently, she is working on designing data-sharing mechanisms for firms in oligopolistic markets, motivated by ideas from federated learning. Recently, she has also been studying dynamics of privacy and reputation-building in principal-agent interactions. Mariel holds an undergraduate degree in Applied Mathematics from Harvard University.
Nov. 13
DBH 4011
1 pm

Yian Ma

Assistant Professor, Halıcıoğlu Data Science Institute
University of California, San Diego

I will introduce some recent progress towards understanding the scalability of Markov chain Monte Carlo (MCMC) methods and their comparative advantage with respect to variational inference. I will fact-check the folklore that “variational inference is fast but biased, MCMC is unbiased but slow”. I will then discuss a combination of the two via reverse diffusion, which holds promise of solving some of the multi-modal problems. This talk will be motivated by the need for Bayesian computation in reinforcement learning problems as well as the differential privacy requirements that we face.

Bio: Yian Ma is an assistant professor at the Halıcıoğlu Data Science Institute and an affiliated faculty member in the Computer Science and Engineering Department of UC San Diego. Prior to UCSD, he spent a year as a visiting faculty researcher at Google Research. Before that, he was a post-doctoral fellow at UC Berkeley, hosted by Mike Jordan. Yian completed his Ph.D. at the University of Washington. His current research primarily revolves around scalable inference methods for credible machine learning, with applications to time series data and sequential decision-making tasks. He has received a Facebook research award and the best paper award at the NeurIPS AABI symposium.
Nov. 20
DBH 4011
1 pm

Yuhua Zhu

Assistant Professor, Halıcıoğlu Data Science Institute and Dept. of Mathematics
University of California, San Diego

In this talk, I will build the connection between Hamilton-Jacobi-Bellman equations (HJB) and the multi-armed bandit (MAB) problems. HJB is an important equation in solving stochastic optimal control problems. MAB is a widely used paradigm for studying the exploration-exploitation trade-off in sequential decision making under uncertainty. This is the first work that establishes this connection in a general setting. I will present an efficient algorithm for solving MAB problems based on this connection and demonstrate its practical applications. This is a joint work with Lexing Ying and Zach Izzo from Stanford University.

Bio: Yuhua Zhu is an assistant professor at UC San Diego, where she holds a joint appointment in the Halıcıoğlu Data Science Institute (HDSI) and the Department of Mathematics. Previously, she was a postdoctoral fellow at Stanford University, mentored by Lexing Ying. She earned her Ph.D. from UW–Madison in 2019, advised by Shi Jin, and obtained her BS in Mathematics from SJTU in 2014. Her work builds a bridge between differential equations and machine learning, spanning the areas of reinforcement learning, stochastic optimization, sequential decision-making, and uncertainty quantification.
Nov. 21
DBH 4011
11 am

Yejin Choi

Wissner-Slivka Professor of Computer Science & Engineering
University of Washington and Allen Institute for Artificial Intelligence

In this talk, I will question if there can be possible impossibilities of large language models (i.e., the fundamental limits of transformers, if any) and the impossible possibilities of language models (i.e., seemingly impossible alternative paths beyond scale, if at all).

Bio: Yejin Choi is Wissner-Slivka Professor and a MacArthur Fellow at the Paul G. Allen School of Computer Science & Engineering at the University of Washington. She is also a senior director at AI2 overseeing the project Mosaic and a Distinguished Research Fellow at the Institute for Ethics in AI at the University of Oxford. Her research investigates if (and how) AI systems can learn commonsense knowledge and reasoning, if machines can (and should) learn moral reasoning, and various other problems in NLP, AI, and Vision including neuro-symbolic integration, language grounding with vision and interactions, and AI for social good. She is a co-recipient of 2 Test of Time Awards (at ACL 2021 and ICCV 2021), 7 Best/Outstanding Paper Awards (at ACL 2023, NAACL 2022, ICML 2022, NeurIPS 2021, AAAI 2019, and ICCV 2013), the Borg Early Career Award (BECA) in 2018, the inaugural Alexa Prize Challenge in 2017, and IEEE AI’s 10 to Watch in 2016.
Nov. 27
DBH 4011
1 pm

Tryphon Georgiou

Distinguished Professor of Mechanical and Aerospace Engineering
University of California, Irvine

The energetic cost of information erasure and of energy transduction can be cast as the stochastic problem of minimizing entropy production during thermodynamic transitions. This formalism of Stochastic Thermodynamics allows quantitative assessment of work exchange and entropy production for systems that are far from equilibrium. In the talk we will highlight the cost of Landauer’s bit-erasure in finite time and explain how to obtain bounds on the performance of Carnot-like thermodynamic engines and of processes that are powered by thermal anisotropy. The talk will be largely based on joint work with Olga Movilla Miangolarra, Amir Taghvaei, Rui Fu, and Yongxin Chen.

Bio: Tryphon T. Georgiou was educated at the National Technical University of Athens, Greece (1979) and the University of Florida, Gainesville (PhD 1983). He is currently a Distinguished Professor at the Department of Mechanical and Aerospace Engineering, University of California, Irvine. He is a Fellow of IEEE, SIAM, IFAC, AAAS and a Foreign Member of the Royal Swedish Academy of Engineering Sciences (IVA).
Dec. 4
DBH 4011
1 pm

Deying Kong

Software Engineer, Google

Despite its extensive range of potential applications in virtual and augmented reality, 3D interacting hand pose estimation from RGB images remains a very challenging problem, due to appearance confusions between keypoints of the two hands and severe hand-hand occlusion. Because of their ability to capture long-range relationships between keypoints, transformer-based methods have gained popularity in the research community. However, existing methods usually deploy tokens at the keypoint level, which inevitably results in high computational and memory complexity. In this talk, we will propose a simple yet novel mechanism, hand-level tokenization, in our transformer-based model, where we deploy only one token for each hand. With this novel design, we will also propose a pose query enhancer module, which can refine the pose prediction iteratively by focusing on features guided by previous coarse pose predictions. As a result, our proposed model, Handformer2T, can achieve high performance while remaining lightweight.

Bio: Deying Kong is currently a software engineer at Google. He earned his PhD in Computer Science from the University of California, Irvine in 2022, under the supervision of Professor Xiaohui Xie. His research interests mainly focus on computer vision, especially hand/human pose estimation.
Dec. 11
No Seminar (Finals Week and NeurIPS Conference)

Spring 2023

Apr. 10
DBH 4011
1 pm

Durk Kingma

Research Scientist
Google Research

Some believe that maximum likelihood is incompatible with high-quality image generation. We provide counter-evidence: diffusion models with SOTA FIDs (e.g. https://arxiv.org/abs/2301.11093) are actually optimized with the ELBO, with very simple data augmentation (additive noise). First, we show that diffusion models in the literature are optimized with various objectives that are special cases of a weighted loss, where the weighting function specifies the weight per noise level. Uniform weighting corresponds to maximizing the ELBO, a principled approximation of maximum likelihood. In current practice diffusion models are optimized with non-uniform weighting due to better results in terms of sample quality. In this work we expose a direct relationship between the weighted loss (with any weighting) and the ELBO objective. We show that the weighted loss can be written as a weighted integral of ELBOs, with one ELBO per noise level. If the weighting function is monotonic, as in some SOTA models, then the weighted loss is a likelihood-based objective: it maximizes the ELBO under simple data augmentation, namely Gaussian noise perturbation. Our main contribution is a deeper theoretical understanding of the diffusion objective, but we also performed some experiments comparing monotonic with non-monotonic weightings, finding that monotonic weighting performs competitively with the best published results.
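Schematically, the key identity can be sketched as follows (our notation, paraphrasing the abstract rather than quoting the paper):

```latex
% Weighted diffusion loss as a weighted integral of per-noise-level ELBOs
% (schematic; notation ours):
\mathcal{L}_{w}(\mathbf{x})
  \;=\; \int \tilde{w}(\lambda)\,
        \bigl[-\mathrm{ELBO}_{\lambda}(\mathbf{x})\bigr]\, d\lambda ,
\qquad
\mathrm{ELBO}_{\lambda}(\mathbf{x})
  \;=\; \text{ELBO of the noise-perturbed data }
        \mathbf{x}_{\lambda} = \alpha_{\lambda}\mathbf{x}
        + \sigma_{\lambda}\boldsymbol{\epsilon},
  \quad \boldsymbol{\epsilon} \sim \mathcal{N}(0, \mathbf{I}).
```

Here \(\tilde{w}\) is the weighting over noise levels induced by the chosen loss weighting \(w\); when \(w\) is monotonic, \(\tilde{w}\) is non-negative, so minimizing \(\mathcal{L}_{w}\) maximizes an expected ELBO under Gaussian-noise data augmentation, as the abstract states.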

Bio: I do research on principled and scalable methods for machine learning, with a focus on generative models. My contributions include the Variational Autoencoder (VAE), the Adam optimizer, Glow, and Variational Diffusion Models, but please see Scholar for a more complete list. I obtained a PhD (cum laude) from the University of Amsterdam in 2017, and was part of the founding team of OpenAI in 2015. Before that, I co-founded Advanza, which was acquired in 2016. My formal name is Diederik, but I have the Frisian nickname Durk (pronounced like Dirk). I currently live in the San Francisco Bay Area.
Apr. 17
DBH 4011
1 pm

Danish Pruthi

Assistant Professor
Department of Computational and Data Sciences (CDS)
Indian Institute of Science (IISc), Bangalore

While large deep learning models have become increasingly accurate, concerns about their (lack of) interpretability have taken center stage. In response, a growing subfield on interpretability and analysis of these models has emerged. While hundreds of techniques have been proposed to “explain” predictions of models, what aims these explanations serve and how they ought to be evaluated are often unstated. In this talk, I will present a framework to quantify the value of explanations, along with specific applications in a variety of contexts. I will end with some of my thoughts on evaluating large language models and the rationales they generate.

Bio: Danish Pruthi is an incoming assistant professor at the Indian Institute of Science (IISc), Bangalore. He received his Ph.D. from the School of Computer Science at Carnegie Mellon University, where he was advised by Graham Neubig and Zachary Lipton. He is broadly interested in the areas of natural language processing and deep learning, with a focus on model interpretability. He completed his bachelor’s degree in computer science from BITS Pilani, Pilani. He has spent time doing research at Google AI, Facebook AI Research, Microsoft Research, Amazon AI and IISc. He is also a recipient of the Siebel Scholarship and the CMU Presidential Fellowship. His legal name is only Danish—a cause of airport quagmires and, in equal parts, funny anecdotes.
Apr. 24
DBH 4011
1 pm

Anthony Chen

PhD Student
Department of Computer Science, UC Irvine

As the strengths of large language models (LLMs) have become prominent, so too have their weaknesses. A glaring weakness of LLMs is their penchant for generating false, biased, or misleading claims in a phenomenon broadly referred to as “hallucinations”. Most LLMs also do not ground their generations to any source, exacerbating this weakness. To enable attribution while still preserving all the powerful advantages of LLMs, we propose RARR (Retrofit Attribution using Research and Revision), a system that 1) automatically retrieves evidence to support the output of any LLM, followed by 2) post-editing the output to fix any information that contradicts the retrieved evidence while preserving the original output as much as possible. When applied to the output of several state-of-the-art LLMs on a diverse set of generation tasks, we find that RARR significantly improves attribution.

Bio: Anthony Chen is a final-year doctoral student advised by Sameer Singh. He is broadly interested in how we can evaluate the limits of large language models and design efficient methods to address their deficiencies. Recently, his research has been focused on tackling the pernicious problem of attribution and hallucinations in large language models and making them more reliable to use.
May 1
DBH 4011
1 pm

Hengrui Cai

Assistant Professor of Statistics
University of California, Irvine

The causal revolution has spurred interest in understanding complex relationships in various fields. Under a general causal graph, the exposure may have a direct effect on the outcome and also an indirect effect regulated by a set of mediators. An analysis of causal effects that interprets the causal mechanism contributed through mediators is hence challenging but in demand. In this talk, we introduce a new statistical framework to comprehensively characterize causal effects with multiple mediators, namely, ANalysis Of Causal Effects (ANOCE). Built upon such causal impact learning, we focus on two emerging challenges in causal relation learning: heterogeneity and spuriousness. To characterize the heterogeneity, we first conceptualize heterogeneous causal graphs (HCGs) by generalizing the causal graphical model with confounder-based interactions and multiple mediators. In practice, only a small number of variables in the graph are relevant for the outcomes of interest. As a result, causal estimation with the full causal graph — especially given limited data — could lead to many falsely discovered, spurious variables that may be highly correlated with but have no causal impact on the target outcome. We propose to learn a class of necessary and sufficient causal graphs (NSCG) that only contain causally relevant variables by utilizing the probabilities of causation. Across empirical studies of simulated and real data applications, we show that the proposed algorithms outperform existing ones and can reveal true heterogeneous and non-spurious causal graphs.

Bio: Dr. Hengrui Cai is an Assistant Professor in the Department of Statistics at the University of California Irvine. She obtained her Ph.D. degree in Statistics at North Carolina State University in 2022. Cai has broad research interests in methodology and theory in causal inference, reinforcement learning, and graphical modeling, to establish reliable, powerful, and interpretable solutions to real-world problems. Currently, her research focuses on causal inference and causal structure learning, and policy optimization and evaluation in reinforcement/deep learning. Her work has been published in conferences including ICLR, NeurIPS, ICML, and IJCAI, as well as journals including the Journal of Machine Learning Research, Stat, and Statistics in Medicine.
May 8
DBH 4011
1 pm

Pierre Baldi and Alexander Shmakov

Department of Computer Science, UC Irvine

The Baldi group will present ongoing progress in the theory and applications of deep learning. On the theory side, we will discuss homogeneous activation functions and their important connections to the concept of generalized neural balance. On the application side, we will present applications of neural transformers to physics, in particular for the assignment of observation measurements to the leaves of partial Feynman diagrams in particle physics. In these applications, the permutation invariance properties of transformers are used to capture fundamental symmetries (e.g. matter vs antimatter) in the laws of physics.

Bio: Pierre Baldi earned M.S. degrees in mathematics and psychology from the University of Paris, France, in 1980, and a Ph.D. degree in mathematics from Caltech in 1986. He is currently a Distinguished Professor in the Department of Computer Science, Director of the Institute for Genomics and Bioinformatics, and Associate Director of the Center for Machine Learning and Intelligent Systems at the University of California, Irvine. His research interests include understanding intelligence in brains and machines. He has made several contributions to the theory of deep learning, and developed and applied deep learning methods for problems in the natural sciences. He has written four books and over 300 peer-reviewed articles. Dr. Baldi was the recipient of the 1993 Lew Allen Award at JPL, the 2010 E. R. Caianiello Prize for research in machine learning, and a 2014 Google Faculty Research Award. He is an elected Fellow of the AAAS, AAAI, IEEE, ACM, and ISCB. Alexander Shmakov is a Ph.D. student in the Baldi research group who loves all things deep learning and robotics. He has published papers on applications of deep learning to planning, robotic control, high-energy physics, astronomy, chemical synthesis, and biology.
May 15
DBH 4011
1 pm

Guy Van den Broeck

Associate Professor of Computer Science
University of California, Los Angeles

Many expect that AI will go from powering chatbots to providing mental health services. That it will go from advertisement to deciding who is given bail. The expectation is that AI will solve society’s problems by simply being more intelligent than we are. Implicit in this bullish perspective is the assumption that AI will naturally learn to reason from data: that it can form trains of thought that “make sense”, similar to how a mental health professional or judge might reason about a case, or more formally, how a mathematician might prove a theorem. This talk will investigate the question of whether this behavior can be learned from data, and how we can design the next generation of AI techniques that can achieve such capabilities, focusing on neuro-symbolic learning and tractable deep generative models.

Bio: Guy Van den Broeck is an Associate Professor and Samueli Fellow at UCLA, in the Computer Science Department, where he directs the StarAI lab. His research interests are in Machine Learning, Knowledge Representation and Reasoning, and Artificial Intelligence in general. His papers have been recognized with awards from key conferences such as AAAI, UAI, KR, OOPSLA, and ILP. Guy is the recipient of an NSF CAREER award, a Sloan Fellowship, and the IJCAI-19 Computers and Thought Award.
May 22
DBH 4011
1 pm

Gabe Hope

PhD Student, Computer Science
University of California, Irvine

Variational autoencoders (VAEs) have proven to be an effective approach to modeling complex data distributions while providing compact representations that can be interpretable and useful for downstream prediction tasks. In this work we train variational autoencoders with the dual goals of good likelihood-based generative modeling and good discriminative performance in supervised and semi-supervised prediction tasks. We show that the dominant approach to training semi-supervised VAEs has key weaknesses: it is fragile as model capacity increases; it is slow due to marginalization over labels; and it incoherently decouples into separate discriminative and generative models when all data is labeled. Our novel framework for semi-supervised VAE training uses a more coherent architecture and an objective that maximizes generative likelihood subject to prediction quality constraints. To handle cases when labels are very sparse, we further enforce a consistency constraint, derived naturally from the generative model, that requires predictions on reconstructed data to match those on the original data. Our approach enables advances in generative modeling to be incorporated by semi-supervised classifiers, which we demonstrate by augmenting deep generative models with latent variables corresponding to spatial transformations and by introducing a “very deep” prediction-constrained VAE with many layers of latent variables. Our experiments show that prediction and consistency constraints improve generative samples as well as image classification performance in semi-supervised settings.
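The constrained training objective can be sketched as follows (our schematic with hypothetical notation, not the paper's exact formulation; \(\ell\) is a prediction loss and \(\lambda\) the Lagrange multiplier for the prediction-quality constraint):

```latex
% Prediction-constrained generative training (schematic):
\max_{\theta,\phi}\; \mathbb{E}_{x}\bigl[\log p_{\theta}(x)\bigr]
\quad \text{s.t.} \quad
\mathbb{E}_{(x,y)}\bigl[\ell\bigl(y,\hat{y}_{\phi}(x)\bigr)\bigr] \le \epsilon;
% equivalently, via a Lagrange multiplier \lambda \ge 0:
\max_{\theta,\phi}\; \mathbb{E}_{x}\bigl[\log p_{\theta}(x)\bigr]
 \;-\; \lambda\,\mathbb{E}_{(x,y)}\bigl[\ell\bigl(y,\hat{y}_{\phi}(x)\bigr)\bigr].
```

The consistency constraint described in the abstract would add an analogous penalty requiring \(\hat{y}_{\phi}\) on reconstructed inputs to agree with \(\hat{y}_{\phi}\) on the originals.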

Bio: Gabe Hope is a final-year PhD student at UC Irvine working with professor Erik Sudderth. His research focuses on deep generative models, interpretable machine learning and semi-supervised learning. This fall he will join the faculty at Harvey Mudd College as a visiting assistant professor in computer science.
May 29
No Seminar (Memorial Day)
June 5
DBH 4011
1 pm

Sangeetha Abdu Jyothi

Assistant Professor of Computer Science
University of California, Irvine

Lack of explainability is a key factor limiting the practical adoption of high-performing Deep Reinforcement Learning (DRL) controllers in systems environments. Explainable RL for networking has hitherto used salient input features to interpret a controller’s behavior. However, these feature-based solutions do not completely explain the controller’s decision-making process. Often, operators are interested in understanding the impact of a controller’s actions on performance in the future, which feature-based solutions cannot capture. In this talk, I will present CrystalBox, a framework that explains a controller’s behavior in terms of its future impact on key network performance metrics. CrystalBox employs a novel learning-based approach to generate succinct and expressive explanations. We use the reward components of the DRL network controller, which are key performance metrics meaningful to operators, as the basis for explanations. I will finally present three practical use cases of CrystalBox: cross-state explainability, guided reward design, and network observability.

Bio: Sangeetha Abdu Jyothi is an Assistant Professor in the Computer Science department at the University of California, Irvine. Her research interests lie at the intersection of computer systems, networking, and machine learning. Prior to UCI, she completed her Ph.D. at the University of Illinois, Urbana-Champaign in 2019 where she was advised by Brighten Godfrey and had a brief stint as a postdoc at VMware Research. She is currently an Affiliated Researcher at VMware Research. She leads the Networking, Systems, and AI Lab (NetSAIL) at UCI. Her current research focus revolves around: Internet and Cloud Resilience, and Systems and Machine Learning.
July 20
DBH 3011
11 am

Vincent Fortuin

Research group leader in Machine Learning
Helmholtz AI

Many researchers have pondered the same existential questions since the release of ChatGPT: Is scale really all you need? Will the future of machine learning rely exclusively on foundation models? Should we all drop our current research agenda and work on the next large language model instead? In this talk, I will try to make the case that the answer to all these questions should be a convinced “no” and that now, maybe more than ever, should be the time to focus on fundamental questions in machine learning again. I will provide evidence for this by presenting three modern use cases of Bayesian deep learning in the areas of self-supervised learning, interpretable additive modeling, and sequential decision making. Together, these will show that the research field of Bayesian deep learning is very much alive and thriving and that its potential for valuable real-world impact is only just unfolding.

Bio: Vincent Fortuin is a tenure-track research group leader at Helmholtz AI in Munich, leading the group for Efficient Learning and Probabilistic Inference for Science (ELPIS). He is also a Branco Weiss Fellow. His research focuses on reliable and data-efficient AI approaches leveraging Bayesian deep learning, deep generative modeling, meta-learning, and PAC-Bayesian theory. Before that, he did his PhD in Machine Learning at ETH Zürich and was a Research Fellow at the University of Cambridge. He is a member of ELLIS, a regular reviewer for all major machine learning conferences, and a co-organizer of the Symposium on Advances in Approximate Bayesian Inference (AABI) and the ICBINB initiative.

Winter 2023

Jan. 30
DBH 4011
1 pm

Maarten Bos

Lead Research Scientist
Snap Research

Corporate research labs aim to push the scientific and technological forefront of innovation outside traditional academia. Snap Inc. combines academia and industry by hiring academic researchers and doing application-driven research. In this talk I will give examples of research projects from my corporate research experience. My goal is to showcase the value of – and hurdles for – working both with and within corporate research labs, and how some of these values and hurdles are different from working in traditional academia.

Bio: Maarten Bos is a Lead Research Scientist at Snap Inc. After receiving his PhD in the Netherlands and completing postdoctoral training at Harvard University, he led a behavioral science group at Disney Research before joining Snap in 2018. His research interests range from decision science, to persuasion, and human-technology interaction. His work has been published in journals such as Science, Psychological Science, and the Journal of Marketing Research, and has been covered by the Wall Street Journal, Harvard Business Review, and The New York Times.
Feb. 6
DBH 4011
1 pm

Kolby Nottingham

PhD Student, Department of Computer Science
University of California, Irvine

While it’s common for other machine learning modalities to benefit from model pretraining, reinforcement learning (RL) agents still typically learn tabula rasa. Large language models (LLMs), trained on internet text, have been used as external knowledge sources for RL, but, on their own, they are noisy and lack the grounding necessary to reason in interactive environments. In this talk, we will cover methods for grounding LLMs in environment dynamics and applying extracted knowledge to training RL agents. Finally, we will demonstrate our newly proposed method for applying LLMs to improving RL sample efficiency through guided exploration. By applying LLMs to guiding exploration rather than using them as planners at execution time, our method remains robust to errors in LLM output while also grounding LLM knowledge in environment dynamics.

Bio: Kolby Nottingham is a PhD student at the University of California, Irvine, where he is co-advised by Professors Roy Fox and Sameer Singh. Kolby’s research interests lie at the intersection of reinforcement learning and natural language processing. His research applies recent advances in large language models to improve sequential decision-making techniques.
Feb. 13
DBH 4011
1 pm

Noble Kennamer

PhD Student, Department of Computer Science
University of California, Irvine

Bayesian optimal experimental design is a sub-field of statistics focused on developing methods to make efficient use of experimental resources. Any potential design is evaluated in terms of a utility function, such as the (theoretically well-justified) expected information gain (EIG); unfortunately, under most circumstances the EIG is intractable to evaluate. In this talk we build on successful variational approaches, which optimize a parameterized variational model with respect to bounds on the EIG. Past work focused on learning a new variational model from scratch for each new design considered. Here we present a novel neural architecture that allows experimenters to optimize a single variational model that can estimate the EIG for potentially infinitely many designs. To further improve computational efficiency, we also propose to train the variational model on a significantly cheaper-to-evaluate lower bound, and show empirically that the resulting model provides an excellent guide for more accurate, but expensive-to-evaluate, bounds on the EIG. We demonstrate the effectiveness of our technique on generalized linear models, a class of statistical models that is widely used in the analysis of controlled experiments. Experiments show that our method greatly improves accuracy over existing approximation strategies, and achieves these results with far better sample efficiency.
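For reference, the EIG mentioned in the abstract has a standard definition (this is the textbook form, not anything specific to the speaker's method): for a design $d$ with prior $p(\theta)$ and likelihood $p(y \mid \theta, d)$, it is the expected reduction in posterior entropy, equivalently a mutual information between parameters and outcomes:

```latex
\mathrm{EIG}(d)
  = \mathbb{E}_{p(y \mid d)}\Big[ H\big[p(\theta)\big] - H\big[p(\theta \mid y, d)\big] \Big]
  = \mathbb{E}_{p(\theta)\,p(y \mid \theta, d)}\left[ \log \frac{p(y \mid \theta, d)}{p(y \mid d)} \right]
```

The intractability comes from the marginal likelihood $p(y \mid d) = \int p(y \mid \theta, d)\,p(\theta)\,d\theta$ inside the logarithm, which is what variational bounds sidestep.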

Bio: Noble Kennamer recently completed his PhD at UC Irvine under Alexander Ihler, where he worked on variational methods for optimal experimental design and applications of machine learning to the physical sciences. In March he will be starting as a Research Scientist at Netflix.
Feb. 20
No Seminar (Presidents’ Day)
Feb. 27
Seminar Canceled
Mar. 6
DBH 4011
1 pm

Shlomo Zilberstein

Professor of Computer Science
University of Massachusetts, Amherst

Competence is the ability to do something well. Competence awareness is the ability to represent and learn a model of self-competence and use it to decide how best to use the agent’s own abilities as well as any available human assistance. This capability is critical for the success and safety of autonomous systems that operate in the open world. In this talk, I introduce two types of competence-aware systems (CAS), namely Type I and Type II CAS. The former refers to a stand-alone system that can learn its own competence and use it to fine-tune itself to the characteristics of the problem instance at hand, without human assistance. The latter is a human-aware system that uses a self-competence model to optimize the utilization of costly human assistive actions. I describe recent results that demonstrate the benefits of the two types of competence awareness in different contexts, including autonomous vehicle decision making.

Bio: Shlomo Zilberstein is Professor of Computer Science and Associate Dean for Research and Engagement in the Manning College of Information and Computer Sciences at the University of Massachusetts, Amherst. He received a B.A. in Computer Science from the Technion, and a Ph.D. in Computer Science from UC Berkeley. Zilberstein’s research focuses on the foundations and applications of resource-bounded reasoning techniques, which allow complex systems to make decisions while coping with uncertainty, missing information, and limited computational resources. His research interests include decision theory, reasoning under uncertainty, Markov decision processes, design of autonomous agents, heuristic search, real-time problem solving, principles of meta-reasoning, planning and scheduling, multi-agent systems, and reinforcement learning. Zilberstein is a Fellow of AAAI and the ACM. He is a recipient of the University of Massachusetts Chancellor’s Medal (2019), the IFAAMAS Influential Paper Award (2019), the AAAI Distinguished Service Award (2019), a National Science Foundation CAREER Award (1996), and the Israel Defense Prize (1992). He has received numerous paper awards from AAAI (2017, 2021), IJCAI (2020), AAMAS (2003), ECAI (1998), ICAPS (2010), and SoCS (2022), among others. He is the past Editor-in-Chief of the Journal of Artificial Intelligence Research, former Chair of the AAAI Conference Committee, former President of ICAPS, a former Councilor of AAAI, and the Chairman of the AI Access Foundation.

CML at NeurIPS 2022

Standard

Researchers associated with the UC Irvine Center for Machine Learning and Intelligent Systems published more than ten workshop and conference papers at the 2022 Conference on Neural Information Processing Systems. Highlights include an oral presentation by PhD students Alex Boyd and Sam Showalter, with Profs. Smyth and Mandt, on Predictive Querying for Autoregressive Neural Sequence Models; a tutorial by Yibo Yang and Stephan Mandt on Data Compression with Machine Learning; a talk by Prof. Pierre Baldi in the All Things Attention workshop; and papers in workshops on Deep Reinforcement Learning, Trustworthy and Socially Responsible Machine Learning, and Time Series for Health.

AI and ML Faculty Openings at UCI

Standard

The Department of Computer Science at the University of California, Irvine invites applications for tenure-track or tenured faculty positions beginning July 1, 2023. This faculty search targets applicants with research expertise in all aspects of artificial intelligence and machine learning, broadly interpreted. Candidates should follow the online application instructions for Recruit opening JPF07847, and submit materials by December 15, 2022 in order to receive full consideration.

Fall 2022

Standard
Oct. 10
DBH 4011
1 pm

Furong Huang

Assistant Professor of Computer Science
University of Maryland

With the burgeoning use of machine learning models in an assortment of applications, there is a need to rapidly and reliably deploy models in a variety of environments. These trustworthy machine learning models must satisfy certain criteria, namely the ability to: (i) adapt and generalize to previously unseen worlds although trained on data that only represent a subset of the world, (ii) allow for non-iid data, (iii) be resilient to (adversarial) perturbations, and (iv) conform to social norms and make ethical decisions. In this talk, towards trustworthy and generally applicable intelligent systems, I will cover some reinforcement learning algorithms that achieve fast adaptation by guaranteed knowledge transfer, principled methods that measure the vulnerability and improve the robustness of reinforcement learning agents, and ethical models that make fair decisions under distribution shifts.

Bio: Furong Huang is an Assistant Professor in the Department of Computer Science at the University of Maryland. She works on statistical and trustworthy machine learning, reinforcement learning, graph neural networks, deep learning theory, and federated learning, with specialization in domain adaptation, algorithmic robustness, and fairness. Furong is a recipient of the NSF CRII Award, the MLconf Industry Impact Research Award, the Adobe Faculty Research Award, and three JP Morgan Faculty Research Awards. She is a finalist of AI in Research – AI Researcher of the Year for the Women in AI Awards North America 2022. She received her Ph.D. in electrical engineering and computer science from UC Irvine in 2016, after which she completed a postdoctoral position at Microsoft Research NYC.
Oct. 17
DBH 4011
1 pm

Bodhi Majumder

PhD Student, Department of Computer Science and Engineering
University of California, San Diego

The use of artificial intelligence in knowledge-seeking applications (e.g., for recommendations and explanations) has shown remarkable effectiveness. However, the increasing demand for interactivity, accessibility, and user-friendliness in these systems requires the underlying components (dialog models, LLMs) to be adequately grounded in up-to-date, real-world context. Yet in reality, even powerful generative models often lack commonsense, explanations, and subjectivity, capabilities central to the long-standing goal of artificial general intelligence. In this talk, I will partly address these problems in three parts and hint at future possibilities and social impacts. Mainly, I will discuss: 1) methods to effectively inject up-to-date knowledge into an existing dialog model without any additional training, 2) the role of background knowledge in generating faithful natural language explanations, and 3) a conversational framework to address subjectivity—balancing task performance and bias mitigation for fair, interpretable predictions.

Bio: Bodhisattwa Prasad Majumder is a final-year PhD student at CSE, UC San Diego, advised by Prof. Julian McAuley. His research goal is to build interactive machines capable of producing knowledge-grounded explanations. He previously interned at the Allen Institute for AI, Google AI, Microsoft Research, and FAIR (Meta AI), and has collaborated with the University of Oxford, the University of British Columbia, and the Alan Turing Institute. He is a recipient of the UCSD CSE Doctoral Award for Research (2022), the Adobe Research Fellowship (2022), the UCSD Friends Fellowship (2022), and the Qualcomm Innovation Fellowship (2020). In 2019, Bodhi led UCSD in the finals of the Amazon Alexa Prize. He also co-authored a best-selling NLP book with O’Reilly Media that is being adopted in universities internationally. Website: http://www.majumderb.com/.
Oct. 24
DBH 4011
1 pm

Mark Steyvers

Professor of Cognitive Sciences
University of California, Irvine

Artificial intelligence (AI) and machine learning models are being increasingly deployed in real-world applications. In many of these applications, there is strong motivation to develop hybrid systems in which humans and AI algorithms can work together, leveraging their complementary strengths and weaknesses. In the first part of the presentation, I will discuss results from a Bayesian framework where we statistically combine the predictions from humans and machines while taking into account the unique ways human and algorithmic confidence are expressed. The framework allows us to investigate the factors that influence complementarity, where a hybrid combination of human and machine predictions leads to better performance than combinations of human or machine predictions alone. In the second part of the presentation, I will discuss some recent work on AI-assisted decision making where individuals are presented with recommended predictions from classifiers. Using a cognitive modeling approach, we can estimate the AI reliance policy used by individual participants. The results show that AI advice is more readily adopted if the individual is in a low-confidence state, receives high-confidence advice from the AI, and when the AI is generally more accurate. In the final part of the presentation, I will discuss the question of “machine theory of mind” and “theory of machine”: how humans and machines can efficiently form mental models of each other. I will show some recent results on theory-of-mind experiments where the goal is for individuals and machine algorithms to predict the performance of other individuals in image classification tasks. The results show performance gaps where human individuals outperform algorithms in mindreading tasks. I will discuss several research directions designed to close the gap.
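As a toy illustration of statistically combining human and machine class probabilities, weighted log-linear pooling is one standard scheme; the speaker's Bayesian framework goes further by modeling how each source expresses confidence, so treat this sketch (and the `w_human` weight) as a generic stand-in rather than the talk's method.

```python
import numpy as np

def combine_log_pool(p_human, p_machine, w_human=0.5):
    """Weighted log-linear pooling of two categorical predictions.

    Each source contributes its log-probabilities, weighted by how much
    we trust it; the result is renormalized into a distribution.
    """
    p_human = np.asarray(p_human, dtype=float)
    p_machine = np.asarray(p_machine, dtype=float)
    log_pool = w_human * np.log(p_human) + (1 - w_human) * np.log(p_machine)
    p = np.exp(log_pool - log_pool.max())  # subtract max for numerical stability
    return p / p.sum()

# Human is confident in class 0; machine slightly favors class 1.
print(combine_log_pool([0.8, 0.1, 0.1], [0.3, 0.5, 0.2]))
```

With equal weights the pooled prediction still favors class 0, because the human's confidence in it outweighs the machine's mild preference for class 1.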

Bio: Mark Steyvers is a Professor of Cognitive Science at UC Irvine and a Chancellor’s Fellow. He has a joint appointment with the Computer Science department and is affiliated with the Center for Machine Learning and Intelligent Systems. His publications span work in cognitive science as well as machine learning, and his research has been funded by NSF, NIH, IARPA, the Navy, and AFOSR. He received his PhD from Indiana University and was a Postdoctoral Fellow at Stanford University. He is currently serving as Associate Editor of Computational Brain and Behavior and Consulting Editor for Psychological Review, and has previously served as President of the Society for Mathematical Psychology, Associate Editor for Psychonomic Bulletin & Review and the Journal of Mathematical Psychology. In addition, he has served as a consultant for a variety of companies such as eBay, Yahoo, Netflix, Merriam Webster, Rubicon, and Gimbal on machine learning problems. Dr. Steyvers received New Investigator Awards from the American Psychological Association as well as the Society of Experimental Psychologists. He also received an award from the Future of Privacy Forum and the Alfred P. Sloan Foundation for his collaborative work with Lumosity.
Oct. 31
DBH 4011
1 pm

Alex Boyd

PhD Student, Department of Statistics
University of California, Irvine

In reasoning about sequential events, it is natural to pose probabilistic queries such as “when will event A occur next” or “what is the probability of A occurring before B”, with applications in areas such as user modeling, medicine, and finance. However, with machine learning shifting towards neural autoregressive models such as RNNs and transformers, probabilistic querying has been largely restricted to simple cases such as next-event prediction. This is in part due to the fact that future querying involves marginalization over large path spaces, which is not straightforward to do efficiently in such models. In this talk, we will describe a novel representation of querying for these discrete sequential models, as well as discuss various approximation and search techniques that can be utilized to help estimate these probabilistic queries. Lastly, we will briefly touch on ongoing work that has extended these techniques into sequential models for continuous-time events.
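A query like “what is the probability of A occurring before B” can in principle be estimated by naive forward sampling from any autoregressive model, which is exactly the expensive path-space marginalization the talk aims to improve on. The sketch below uses a made-up first-order transition table as the “model”; any neural autoregressive model exposing next-event probabilities could be substituted.

```python
import random

# Toy autoregressive model over events {"A", "B", "C"}: the next-event
# distribution depends only on the previous event. This transition
# table is an invented illustration, not data from the talk.
NEXT = {
    "A": {"A": 0.2, "B": 0.3, "C": 0.5},
    "B": {"A": 0.4, "B": 0.2, "C": 0.4},
    "C": {"A": 0.3, "B": 0.3, "C": 0.4},
}

def sample_next(prev):
    """Sample the next event from the model's conditional distribution."""
    r, acc = random.random(), 0.0
    for event, p in NEXT[prev].items():
        acc += p
        if r < acc:
            return event
    return event  # fall back to the last event if rounding leaves a gap

def prob_a_before_b(start="C", n_samples=20000, max_steps=100):
    """Naive Monte Carlo estimate of P(A occurs before B)."""
    hits = 0
    for _ in range(n_samples):
        prev = start
        for _ in range(max_steps):
            prev = sample_next(prev)
            if prev == "A":
                hits += 1
                break
            if prev == "B":
                break
    return hits / n_samples

random.seed(0)
print(prob_a_before_b())
```

For this symmetric table the exact answer from state “C” is 0.5, so the estimate should land close to that; the talk's contribution is estimating such queries far more efficiently than this brute-force sampler.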

Bio: Alex Boyd is a Statistics PhD candidate at UC Irvine, co-advised by Padhraic Smyth and Stephan Mandt. His work focuses on improving probabilistic methods, primarily for deep sequential models. He was selected in 2020 as a National Science Foundation Graduate Fellow.
Nov. 7
DBH 4011
1 pm

Yanning Shen

Assistant Professor of Electrical Engineering and Computer Science
University of California, Irvine

We live in an era of data deluge, where pervasive media collect massive amounts of data, often in a streaming fashion. Learning from these dynamic and large volumes of data is hence expected to bring significant science and engineering advances, along with consequent improvements in quality of life. However, with the blessings come big challenges. The sheer volume of data makes it impossible to run analytics in batch form. Large-scale datasets are noisy, incomplete, and prone to outliers. As many sources continuously generate data in real time, it is often impossible to store all of it. Thus, analytics must often be performed in real time, without a chance to revisit past entries. In response to these challenges, this talk will first introduce an online scalable function approximation scheme that is suitable for various machine learning tasks. The novel approach adaptively learns and tracks the sought nonlinear function ‘on the fly’ with quantifiable performance guarantees, even in adversarial environments with unknown dynamics. Building on this robust and scalable function approximation framework, a scalable online learning approach with graph feedback will be outlined next for online learning with possibly related models. The effectiveness of the novel algorithms will be showcased on several real-world datasets.
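To make the single-pass constraint concrete, here is a generic sketch of online nonlinear function approximation with random Fourier features: each streaming sample triggers one SGD update and is never revisited. This is a standard illustration of the setting, not the speaker's algorithm; the feature count, frequency scale, and learning rate are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 100                                 # number of random features
W = rng.normal(scale=3.0, size=D)       # random frequencies (1-D input)
b = rng.uniform(0, 2 * np.pi, size=D)   # random phases
theta = np.zeros(D)                     # linear weights, learned online

def features(x):
    """Random Fourier feature map approximating a Gaussian kernel."""
    return np.sqrt(2.0 / D) * np.cos(W * x + b)

def online_update(x, y, lr=0.1):
    """One SGD step on squared loss for a single streaming sample."""
    global theta
    z = features(x)
    theta += lr * (y - theta @ z) * z

# Stream noisy samples of y = sin(3x); each point is seen exactly once.
for _ in range(5000):
    x = rng.uniform(-2, 2)
    online_update(x, np.sin(3 * x) + 0.05 * rng.normal())

print(theta @ features(0.5))  # prediction near sin(1.5)
```

Because each update costs O(D) regardless of how many samples have streamed by, the memory and per-step compute stay constant even as the stream grows, which is the property that batch analytics lacks.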

Bio: Yanning Shen is an assistant professor with the EECS department at the University of California, Irvine. She received her Ph.D. degree from the University of Minnesota (UMN) in 2019. She was a finalist for the Best Student Paper Award at the 2017 IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, and the 2017 Asilomar Conference on Signals, Systems, and Computers. She was selected as a Rising Star in EECS by Stanford University in 2017. She received the Microsoft Academic Grant Award for AI Research in 2021, the Google Research Scholar Award in 2022, and the Hellman Fellowship in 2022. Her research interests span the areas of machine learning, network science, data science, and signal processing.
Nov. 14
DBH 4011
1 pm

Muhao Chen

Assistant Research Professor of Computer Science
University of Southern California

Information extraction (IE) is the process of automatically inducing structures of concepts and relations described in natural language text. It is the fundamental task for assessing a machine’s ability for natural language understanding, as well as the essential step for acquiring the structured knowledge representation that is integral to any knowledge-driven AI system. Despite its importance, obtaining direct supervision for IE tasks is very difficult, as it requires expert annotators to read through long documents and identify complex structures. Therefore, a robust and accountable IE model must be achievable with minimal and imperfect supervision. Towards this mission, this talk covers recent advances in machine learning and inference technologies that (i) grant robustness against noise and perturbation, (ii) prevent systematic errors caused by spurious correlations, and (iii) provide indirect supervision for label-efficient and logically consistent IE.

Bio: Muhao Chen is an Assistant Research Professor of Computer Science at USC, and the director of the USC Language Understanding and Knowledge Acquisition (LUKA) Lab. His research focuses on robust and minimally supervised machine learning for natural language understanding, structured data processing, and knowledge acquisition from unstructured data. His work has been recognized with an NSF CRII Award, faculty research awards from Cisco and Amazon, an ACM SIGBio Best Student Paper Award and a best paper nomination at CoNLL. Dr. Chen obtained his Ph.D. degree from UCLA Department of Computer Science in 2019, and was a postdoctoral researcher at UPenn prior to joining USC.
Nov. 21
DBH 4011
1 pm

Peter Orbanz

Professor of Machine Learning
Gatsby Computational Neuroscience Unit, University College London

Consider a large random structure — a random graph, a stochastic process on the line, a random field on the grid — and a function that depends only on a small part of the structure. Now use a family of transformations to ‘move’ the domain of the function over the structure, collect each function value, and average. Under suitable conditions, the law of large numbers generalizes to such averages; that is one of the deep insights of modern ergodic theory. My own recent work with Morgane Austern (Harvard) shows that central limit theorems and other higher-order properties also hold. Loosely speaking, if the i.i.d. assumption of classical statistics is substituted by suitable properties formulated in terms of groups, the fundamental theorems of inference still hold.

Bio: Peter Orbanz is a Professor of Machine Learning in the Gatsby Computational Neuroscience Unit at University College London. He studies large systems of dependent variables in machine learning and inference problems. That involves symmetry and group invariance properties, such as exchangeability and stationarity, random graphs and random structures, hierarchies of latent variables, and the intersection of ergodic theory and statistical physics with statistics and machine learning. In the past, Peter was a PhD student of Joachim M. Buhmann at ETH Zurich, a postdoc with Zoubin Ghahramani at the University of Cambridge, and Assistant and Associate Professor in the Department of Statistics at Columbia University.
Nov. 28
No Seminar (NeurIPS Conference)

CML Researchers win NAACL Paper Award

Standard

Congratulations to CML PhD student Robert Logan, and his advisor Prof. Sameer Singh, who received a Best New Task Paper Award at the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). Their method, FRUIT: Faithfully Reflecting Updated Information in Text, uses language models to automatically update articles (like those on Wikipedia) when new evidence is obtained. This work is motivated not only by a desire to assist the volunteers who maintain Wikipedia, but by the ways it pushes the boundaries of the NLP field.