The Department of Computer Science at the University of California, Irvine invites applications for tenure-track or tenured faculty positions beginning July 1, 2023. This faculty search targets applicants with research expertise in all aspects of artificial intelligence and machine learning, broadly interpreted. Candidates should follow the online application instructions for Recruit opening JPF07847, and submit materials by December 15, 2022 in order to receive full consideration.
Fall 2022
Oct. 10 DBH 4011 1 pm |
Furong Huang Assistant Professor, Department of Computer Science University of Maryland With the burgeoning use of machine learning models in an assortment of applications, there is a need to rapidly and reliably deploy models in a variety of environments. These trustworthy machine learning models must satisfy certain criteria, namely the ability to: (i) adapt and generalize to previously unseen worlds although trained on data that only represent a subset of the world, (ii) allow for non-i.i.d. data, (iii) be resilient to (adversarial) perturbations, and (iv) conform to social norms and make ethical decisions. In this talk, towards trustworthy and generally applicable intelligent systems, I will cover some reinforcement learning algorithms that achieve fast adaptation by guaranteed knowledge transfer, principled methods that measure the vulnerability and improve the robustness of reinforcement learning agents, and ethical models that make fair decisions under distribution shifts.
Bio: Furong Huang is an Assistant Professor in the Department of Computer Science at the University of Maryland. She works on statistical and trustworthy machine learning, reinforcement learning, graph neural networks, deep learning theory, and federated learning, with specialization in domain adaptation, algorithmic robustness, and fairness. Furong is a recipient of the NSF CRII Award, the MLconf Industry Impact Research Award, the Adobe Faculty Research Award, and three JP Morgan Faculty Research Awards. She was a finalist for AI Researcher of the Year (AI in Research) at the Women in AI Awards North America 2022. She received her Ph.D. in electrical engineering and computer science from UC Irvine in 2016, after which she completed a postdoctoral position at Microsoft Research NYC. |
Oct. 17 DBH 4011 1 pm |
Bodhi Majumder PhD Student, Department of Computer Science and Engineering, University of California, San Diego The use of artificial intelligence in knowledge-seeking applications (e.g., for recommendations and explanations) has shown remarkable effectiveness. However, the increasing demand for interactivity, accessibility, and user-friendliness in these systems requires the underlying components (dialog models, LLMs) to be adequately grounded in up-to-date, real-world context. In reality, even powerful generative models often lack commonsense, explanations, and the ability to handle subjectivity, capabilities central to the long-standing goals of artificial general intelligence. In this talk, I will partly address these problems in three parts and hint at future possibilities and social impacts. Mainly, I will discuss: 1) methods to effectively inject up-to-date knowledge into an existing dialog model without any additional training, 2) the role of background knowledge in generating faithful natural language explanations, and 3) a conversational framework that addresses subjectivity, balancing task performance and bias mitigation for fair, interpretable predictions.
Bio: Bodhisattwa Prasad Majumder is a final-year PhD student at CSE, UC San Diego, advised by Prof. Julian McAuley. His research goal is to build interactive machines capable of producing knowledge-grounded explanations. He previously interned at the Allen Institute for AI, Google AI, Microsoft Research, and FAIR (Meta AI), and has collaborated with the University of Oxford, the University of British Columbia, and the Alan Turing Institute. He is a recipient of the UCSD CSE Doctoral Award for Research (2022), the Adobe Research Fellowship (2022), the UCSD Friends Fellowship (2022), and the Qualcomm Innovation Fellowship (2020). In 2019, Bodhi led UCSD in the finals of the Amazon Alexa Prize. He also co-authored a best-selling NLP book with O’Reilly Media that has been adopted in universities internationally. Website: http://www.majumderb.com/. |
Oct. 24 DBH 4011 1 pm |
Mark Steyvers Professor, Department of Cognitive Sciences, University of California, Irvine Artificial intelligence (AI) and machine learning models are being increasingly deployed in real-world applications. In many of these applications, there is strong motivation to develop hybrid systems in which humans and AI algorithms can work together, leveraging their complementary strengths and weaknesses. In the first part of the presentation, I will discuss results from a Bayesian framework where we statistically combine the predictions from humans and machines while taking into account the unique ways human and algorithmic confidence is expressed. The framework allows us to investigate the factors that influence complementarity, where a hybrid combination of human and machine predictions leads to better performance than combinations of human or machine predictions alone. In the second part of the presentation, I will discuss some recent work on AI-assisted decision making where individuals are presented with recommended predictions from classifiers. Using a cognitive modeling approach, we can estimate the AI reliance policy used by individual participants. The results show that AI advice is more readily adopted when the individual is in a low-confidence state, when the AI provides high-confidence advice, and when the AI is generally more accurate. In the final part of the presentation, I will discuss the questions of “machine theory of mind” and “theory of machine”: how humans and machines can efficiently form mental models of each other. I will show some recent results on theory-of-mind experiments where the goal is for individuals and machine algorithms to predict the performance of other individuals in image classification tasks. The results show performance gaps where human individuals outperform algorithms in mindreading tasks. I will discuss several research directions designed to close the gap.
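To make the idea of statistically combining human and machine predictions concrete, here is a minimal sketch assuming calibrated categorical probabilities from each source. It uses a generic confidence-weighted log-linear pool rather than the hierarchical Bayesian model described in the talk, and the weights and class probabilities are purely illustrative.

```python
import numpy as np

def log_linear_pool(p_human, p_machine, w_human=0.5, w_machine=0.5):
    """Combine two categorical probability vectors with a weighted log-linear pool.

    A generic illustration of hybrid human-machine prediction, not the Bayesian
    framework from the talk; the weights stand in for how much each source's
    confidence is trusted.
    """
    p_human = np.asarray(p_human, dtype=float)
    p_machine = np.asarray(p_machine, dtype=float)
    log_pool = w_human * np.log(p_human + 1e-12) + w_machine * np.log(p_machine + 1e-12)
    pooled = np.exp(log_pool - log_pool.max())
    return pooled / pooled.sum()

# Example: the human favors class 0, the classifier favors class 1.
print(log_linear_pool([0.7, 0.2, 0.1], [0.2, 0.6, 0.2]))
```

In this toy setting, complementarity would show up as the pooled prediction being more accurate than either source alone.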
Bio: Mark Steyvers is a Professor of Cognitive Science at UC Irvine and a Chancellor’s Fellow. He has a joint appointment with the Computer Science department and is affiliated with the Center for Machine Learning and Intelligent Systems. His publications span cognitive science as well as machine learning, and his research has been funded by NSF, NIH, IARPA, the Navy, and AFOSR. He received his PhD from Indiana University and was a Postdoctoral Fellow at Stanford University. He is currently serving as Associate Editor of Computational Brain and Behavior and Consulting Editor for Psychological Review, and has previously served as President of the Society for Mathematical Psychology and Associate Editor for Psychonomic Bulletin & Review and the Journal of Mathematical Psychology. In addition, he has served as a consultant for a variety of companies such as eBay, Yahoo, Netflix, Merriam-Webster, Rubicon, and Gimbal on machine learning problems. Dr. Steyvers received New Investigator Awards from the American Psychological Association as well as the Society of Experimental Psychologists. He also received an award from the Future of Privacy Forum and the Alfred P. Sloan Foundation for his collaborative work with Lumosity. |
Oct. 31 DBH 4011 1 pm |
Alex Boyd PhD Student, Department of Statistics, University of California, Irvine In reasoning about sequential events, it is natural to pose probabilistic queries such as “when will event A occur next” or “what is the probability of A occurring before B”, with applications in areas such as user modeling, medicine, and finance. However, with machine learning shifting towards neural autoregressive models such as RNNs and transformers, probabilistic querying has been largely restricted to simple cases such as next-event prediction. This is in part because queries about the future involve marginalization over large path spaces, which is not straightforward to do efficiently in such models. In this talk, we will describe a novel representation of querying for these discrete sequential models, as well as discuss various approximation and search techniques that can be utilized to help estimate these probabilistic queries. Lastly, we will briefly touch on ongoing work that has extended these techniques to sequential models for continuous-time events.
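As a concrete illustration of why such queries are hard, the sketch below estimates P(A occurs before B) by naive Monte Carlo rollouts from a stand-in next-event distribution; `sample_next_event` is a hypothetical placeholder for a trained autoregressive model, and the talk's methods are aimed precisely at avoiding this brute-force marginalization over paths.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_next_event(history):
    """Stand-in for an autoregressive model's next-event distribution.

    A real model (RNN/transformer) would condition on the full history; here we
    just use a fixed categorical distribution over three event types.
    """
    return rng.choice(3, p=[0.2, 0.3, 0.5])

def prob_a_before_b(event_a=0, event_b=1, n_samples=5000, max_len=100):
    """Naive Monte Carlo estimate of P(event A occurs before event B)."""
    hits = 0
    for _ in range(n_samples):
        history = []
        for _ in range(max_len):
            e = sample_next_event(history)
            history.append(e)
            if e == event_a:
                hits += 1
                break
            if e == event_b:
                break
    return hits / n_samples

print(prob_a_before_b())  # for this toy distribution, analytically 0.2 / (0.2 + 0.3) = 0.4
```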
Bio: Alex Boyd is a Statistics PhD candidate at UC Irvine, co-advised by Padhraic Smyth and Stephan Mandt. His work focuses on improving probabilistic methods, primarily for deep sequential models. He was selected in 2020 as a National Science Foundation Graduate Fellow. |
Nov. 7 DBH 4011 1 pm |
Yanning Shen Assistant Professor of Electrical Engineering and Computer Science University of California, Irvine We live in an era of data deluge, where pervasive media collect massive amounts of data, often in a streaming fashion. Learning from these dynamic and large volumes of data is hence expected to bring significant science and engineering advances along with consequent improvements in quality of life. However, with the blessings come big challenges. The sheer volume of data makes it impossible to run analytics in batch form. Large-scale datasets are noisy, incomplete, and prone to outliers. As many sources continuously generate data in real-time, it is often impossible to store all of it. Thus, analytics must often be performed in real-time, without a chance to revisit past entries. In response to these challenges, this talk will first introduce an online scalable function approximation scheme that is suitable for various machine learning tasks. The novel approach adaptively learns and tracks the sought nonlinear function ‘on the fly’ with quantifiable performance guarantees, even in adversarial environments with unknown dynamics. Building on this robust and scalable function approximation framework, a scalable online learning approach with graph feedback will be outlined next for online learning with possibly related models. The effectiveness of the novel algorithms will be showcased in several real-world datasets.
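For flavor, here is a minimal sketch of streaming nonlinear function approximation using random Fourier features and SGD. It is a generic online kernel-learning baseline under assumed settings (an RBF-style feature map and squared loss), not the adaptive multi-kernel algorithm with performance guarantees described in the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

d, D = 3, 200                      # input dimension, number of random features
W = rng.normal(scale=1.0, size=(D, d))
b = rng.uniform(0, 2 * np.pi, size=D)

def features(x):
    """Random Fourier features approximating an RBF kernel."""
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

theta = np.zeros(D)                # model updated one sample at a time
lr = 0.1

def online_update(x, y):
    """Single streaming step: predict, observe the loss, update, discard the sample."""
    global theta
    z = features(x)
    y_hat = theta @ z
    theta -= lr * (y_hat - y) * z  # SGD on squared loss
    return y_hat

for t in range(1000):
    x_t = rng.normal(size=d)
    y_t = np.sin(x_t[0]) + 0.1 * rng.normal()   # unknown nonlinear target
    online_update(x_t, y_t)
```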
Bio: Yanning Shen is an assistant professor with the EECS department at the University of California, Irvine. She received her Ph.D. degree from the University of Minnesota (UMN) in 2019. She was a finalist for the Best Student Paper Award at the 2017 IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, and the 2017 Asilomar Conference on Signals, Systems, and Computers. She was selected as a Rising Star in EECS by Stanford University in 2017. She received the Microsoft Academic Grant Award for AI Research in 2021, the Google Research Scholar Award in 2022, and the Hellman Fellowship in 2022. Her research interests span the areas of machine learning, network science, data science, and signal processing. |
Nov. 14 DBH 4011 1 pm |
Muhao Chen Assistant Research Professor, Department of Computer Science, University of Southern California Information extraction (IE) is the process of automatically inducing structures of concepts and relations described in natural language text. It is a fundamental task for assessing a machine’s ability to understand natural language, as well as an essential step for acquiring the structured knowledge representations that are integral to knowledge-driven AI systems. Despite its importance, obtaining direct supervision for IE tasks is very difficult, as it requires expert annotators to read through long documents and identify complex structures. Therefore, a robust and accountable IE model has to be achievable with minimal and imperfect supervision. Towards this mission, this talk covers recent advances in machine learning and inference technologies that (i) grant robustness against noise and perturbation, (ii) prevent systematic errors caused by spurious correlations, and (iii) provide indirect supervision for label-efficient and logically consistent IE.
Bio: Muhao Chen is an Assistant Research Professor of Computer Science at USC and the director of the USC Language Understanding and Knowledge Acquisition (LUKA) Lab. His research focuses on robust and minimally supervised machine learning for natural language understanding, structured data processing, and knowledge acquisition from unstructured data. His work has been recognized with an NSF CRII Award, faculty research awards from Cisco and Amazon, an ACM SIGBio Best Student Paper Award, and a best paper nomination at CoNLL. Dr. Chen obtained his Ph.D. from the UCLA Department of Computer Science in 2019 and was a postdoctoral researcher at UPenn prior to joining USC. |
Nov. 21 DBH 4011 1 pm |
Peter Orbanz Professor of Machine Learning Gatsby Computational Neuroscience Unit, University College London Consider a large random structure — a random graph, a stochastic process on the line, a random field on the grid — and a function that depends only on a small part of the structure. Now use a family of transformations to ‘move’ the domain of the function over the structure, collect each function value, and average. Under suitable conditions, the law of large numbers generalizes to such averages; that is one of the deep insights of modern ergodic theory. My own recent work with Morgane Austern (Harvard) shows that central limit theorems and other higher-order properties also hold. Loosely speaking, if the i.i.d. assumption of classical statistics is substituted by suitable properties formulated in terms of groups, the fundamental theorems of inference still hold.
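Schematically, and only as a rough paraphrase of this line of work rather than the precise theorems, the kind of statement involved is the following.

```latex
% Schematic only: X is a random structure whose law is invariant under a group G
% acting by transformations g, f is a local statistic, and F_n are growing finite
% sets of transformations. A law of large numbers holds for the group averages,
% and (per Austern & Orbanz, under suitable conditions) so does a central limit
% theorem after rescaling:
\[
  \frac{1}{|F_n|} \sum_{g \in F_n} f(g \cdot X)
  \;\xrightarrow[n \to \infty]{}\;
  \mathbb{E}\!\left[\, f(X) \,\middle|\, \mathcal{I} \,\right],
  \qquad
  \sqrt{|F_n|}\,\Bigl( \tfrac{1}{|F_n|} \textstyle\sum_{g \in F_n} f(g \cdot X) - \mu \Bigr)
  \;\Rightarrow\; \mathcal{N}(0, \sigma^2),
\]
% where \mathcal{I} is the sigma-field of G-invariant events and \mu, \sigma^2
% depend on f and on the law of X.
```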
Bio: Peter Orbanz is a Professor of Machine Learning in the Gatsby Computational Neuroscience Unit at University College London. He studies large systems of dependent variables in machine learning and inference problems. That involves symmetry and group invariance properties, such as exchangeability and stationarity, random graphs and random structures, hierarchies of latent variables, and the intersection of ergodic theory and statistical physics with statistics and machine learning. In the past, Peter was a PhD student of Joachim M. Buhmann at ETH Zurich, a postdoc with Zoubin Ghahramani at the University of Cambridge, and Assistant and Associate Professor in the Department of Statistics at Columbia University. |
Nov. 28 |
No Seminar (NeurIPS Conference)
|
Jyothi Named a Rising Star by N2Women
Congratulations to Sangeetha Abdu Jyothi for being named a 2022 Rising Star in Computer Networking and Communications by N2Women. Prof. Jyothi explores innovative applications of machine learning to systems and networking problems, including award-winning recent work characterizing the resilience of the internet to solar superstorms.
CML Researchers win NAACL Paper Award
Congratulations to CML PhD student Robert Logan and his advisor Prof. Sameer Singh, who received a Best New Task Paper Award at the 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL). Their method, FRUIT: Faithfully Reflecting Updated Information in Text, uses language models to automatically update articles (like those on Wikipedia) when new evidence is obtained. This work is motivated not only by a desire to assist the volunteers who maintain Wikipedia, but also by the ways it pushes the boundaries of the NLP field.
Spring 2022
Live Stream for all Spring 2022 CML Seminars
Maurizio Filippone Associate Professor, EURECOM and Ba-Hien Tran PhD Student, EURECOM YouTube Stream: https://youtu.be/oZAuh686ipw The Bayesian treatment of neural networks dictates that a prior distribution is specified over their weight and bias parameters. This poses a challenge because modern neural networks are characterized by a huge number of parameters and non-linearities. The choice of these priors has an unpredictable effect on the distribution of the functional output, which can be a hugely limiting aspect of Bayesian deep learning models. In contrast, Gaussian processes offer a rigorous non-parametric framework for defining prior distributions over the space of functions. In this talk, we introduce a novel and robust framework for imposing such functional priors on modern neural networks for supervised learning tasks by minimizing the Wasserstein distance between samples of stochastic processes. In addition, we extend this framework to carry out model selection for Bayesian autoencoders for unsupervised learning tasks. We provide extensive experimental evidence that coupling these priors with scalable Markov chain Monte Carlo sampling offers systematically large performance improvements over alternative choices of priors and state-of-the-art approximate Bayesian deep learning approaches.
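A toy numerical sketch of the underlying idea follows, comparing samples of a neural-network prior and a GP prior at a single input via a one-dimensional Wasserstein distance. The architecture, prior scales, and the use of `scipy.stats.wasserstein_distance` are illustrative assumptions; the actual framework minimizes a Wasserstein objective between stochastic processes while tuning the priors.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)

def sample_nn_prior(x, n_samples=2000, width=100, w_std=1.0, b_std=1.0):
    """Draw f(x) under a two-layer neural network prior with Gaussian weights."""
    samples = []
    for _ in range(n_samples):
        W1 = rng.normal(scale=w_std / np.sqrt(len(x)), size=(width, len(x)))
        b1 = rng.normal(scale=b_std, size=width)
        W2 = rng.normal(scale=w_std / np.sqrt(width), size=width)
        samples.append(W2 @ np.tanh(W1 @ x + b1))
    return np.array(samples)

def sample_gp_prior(n_samples=2000, variance=1.0):
    """Marginal of a zero-mean GP prior at a single input is just a Gaussian."""
    return rng.normal(scale=np.sqrt(variance), size=n_samples)

# 1-D Wasserstein distance between the two prior marginals at one test input.
x = np.array([0.3, -1.2, 0.7])
print(wasserstein_distance(sample_nn_prior(x), sample_gp_prior()))
```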
Bio: Maurizio Filippone received a Master’s degree in Physics and a Ph.D. in Computer Science from the University of Genova, Italy, in 2004 and 2008, respectively. In 2007, he was a Research Scholar with George Mason University, Fairfax, VA. From 2008 to 2011, he was a Research Associate with the University of Sheffield, U.K. (2008-2009), with the University of Glasgow, U.K. (2010), and with University College London, U.K. (2011). From 2011 to 2015, he was a Lecturer at the University of Glasgow, U.K., and he is currently AXA Chair of Computational Statistics and Associate Professor at EURECOM, Sophia Antipolis, France. His current research interests include the development of tractable and scalable Bayesian inference techniques for Gaussian processes and Deep/Conv Nets with applications in life and environmental sciences. Bio: Ba-Hien Tran is currently a PhD student within the Data Science department of EURECOM, under the supervision of Professor Maurizio Filippone. His research focuses on Accelerating Inference for Deep Probabilistic Modeling. In 2016, he received a Bachelor of Science degree with honors in Computer Science from Vietnam National University, HCMC. His thesis investigated Deep Learning approaches for data-driven image captioning. In 2020, he received a Master of Science in Engineering degree in Data Science from Télécom Paris. His thesis focused on Bayesian Inference for Deep Neural Networks. |
|
Ties van Rozendaal Senior Machine Learning Researcher Qualcomm AI Research YouTube Stream: https://youtu.be/LQu-kwpfFg4 Neural data compression has been shown to outperform classical methods in terms of rate-distortion performance, with results still improving rapidly. These models are fitted to a training dataset and cannot be expected to optimally compress test data in general due to limitations on model capacity, distribution shifts, and imperfect optimization. If the test-time data distribution is known and has relatively low entropy, the model can easily be finetuned or adapted to this distribution. Instance-adaptive methods take this approach to the extreme, adapting the model to a single test instance, and signaling the updated model along in the bitstream. In this talk, we will show the potential of different types of instance-adaptive methods and discuss the tradeoffs that these methods pose.
Bio: Ties is a senior machine learning researcher at Qualcomm AI Research. He obtained his master’s degree at the University of Amsterdam with a thesis on personalizing automatic speech recognition systems using unsupervised methods. At Qualcomm AI Research, he has been working on neural compression, with a focus on using generative models to compress image and video data. His research includes work on semantic compression and constrained optimization as well as instance-adaptive and neural-implicit compression. |
|
Robin Jia Assistant Professor of Computer Science University of Southern California YouTube Stream: https://youtu.be/ALqqlgbzAB0 Natural language processing (NLP) models have achieved impressive accuracies on in-distribution benchmarks, but they are unreliable in out-of-distribution (OOD) settings. In this talk, I will give an exclusive preview of my group’s ongoing work on evaluating and improving model performance in OOD settings. First, I will propose likelihood splits, a general-purpose way to create challenging non-i.i.d. benchmarks by measuring generalization to the tail of the data distribution, as identified by a language model. Second, I will describe the advantages of neurosymbolic approaches over end-to-end pretrained models for OOD generalization in visual question answering; these results highlight the importance of measuring OOD generalization when comparing modeling approaches. Finally, I will show how synthesized examples can improve open-set recognition, the task of abstaining on OOD examples that come from classes never seen at training time.
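A minimal sketch of the likelihood-split idea under stated assumptions: `log_likelihood_fn` stands in for a pretrained language model's scoring function (not provided here), and the split simply holds out the lowest-likelihood tail of the dataset as the harder, non-i.i.d. test set.

```python
import numpy as np

def likelihood_split(examples, log_likelihood_fn, tail_fraction=0.2):
    """Create a challenging train/test split from the tail of the data distribution.

    `log_likelihood_fn` is assumed to be a language model's per-example scoring
    function; examples whose score falls in the lowest `tail_fraction` quantile
    become the test set, the rest become the training set.
    """
    scores = np.array([log_likelihood_fn(ex) for ex in examples])
    cutoff = np.quantile(scores, tail_fraction)
    test = [ex for ex, s in zip(examples, scores) if s <= cutoff]
    train = [ex for ex, s in zip(examples, scores) if s > cutoff]
    return train, test
```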
Bio: Robin Jia is an Assistant Professor of Computer Science at the University of Southern California. He received his Ph.D. in Computer Science from Stanford University, where he was advised by Percy Liang. He has also spent time as a visiting researcher at Facebook AI Research, working with Luke Zettlemoyer and Douwe Kiela. He is interested broadly in natural language processing and machine learning, with a particular focus on building NLP systems that are robust to distribution shift. Robin’s work has received best paper awards at ACL and EMNLP. |
|
May 23 |
No Seminar
|
May 30 |
No Seminar (Memorial Day Holiday)
|
Bobak Pezeshki PhD Student, Department of Computer Science University of California, Irvine YouTube Stream: https://youtu.be/Yl_aCTieVqc Computational protein design (CPD) is the task of creating new proteins to fulfill a desired function. In this talk, I will share work recently accepted at UAI 2022 based on a new formulation of CPD as a graphical model designed for optimizing subunit binding affinity. These new methods showed promising results when compared with the state-of-the-art algorithm BBK*, which is part of a long-developed software package dedicated to CPD. In the talk, I will first describe CPD in general and for optimizing a quantity called K* (which approximates binding affinity). I will relate this to the well-known task of MMAP, for which many powerful algorithms have been developed recently and from which our methods are inspired. Next, I will give a preview of the promising results of our new framework. I will then go on to describe the framework, presenting the formulation of the problem as a graphical model for K* optimization and introducing a weighted mini-bucket heuristic for bounding K* and guiding search. Finally, I will share our algorithm AOBB-K* and modifications that can enhance it, describing some of the empirical benefits and limitations of our scheme. To conclude, I will outline some future directions for advancing the use of this framework.
Bio: Bobak Pezeshki is a fifth-year PhD student in Computer Science at the University of California, Irvine, advised by Professor Rina Dechter. His research focuses on automated reasoning over graphical models, with emphases on Abstraction Sampling and on applying such reasoning to computational protein design. He completed his undergraduate studies at UC Berkeley, majoring in Molecular and Cell Biology (with an emphasis in Biochemistry) and Integrative Biology. Before pursuing his PhD at UCI, he was involved in protein biochemistry research at the Stroud Lab, UCSF, and at Novartis Vaccines and Diagnostics. |
CML faculty elected as AAAS Fellows
Two faculty affiliated with the UCI Center for Machine Learning and Intelligent Systems have been elected as 2021 AAAS Fellows, joining 190 other AAAS Fellows at UC Irvine. Rina Dechter, Distinguished Professor of Computer Science and Associate Dean for Research in the Donald Bren School of Information & Computer Sciences, was elected for contributions to computational aspects of automated reasoning and knowledge representation, including search, constraint processing, and probabilistic reasoning, and for service to the computing community. Padhraic Smyth, Chancellor’s Professor of Computer Science and Associate Director of the UCI Center for Machine Learning, was elected for distinguished contributions to the field of machine learning, particularly the development of statistical foundations and methodologies. Congratulations to them both!
Winter 2022
Live Stream for all Winter 2022 CML Seminars
January 3 |
No Seminar
|
Roy Fox Assistant Professor Department of Computer Science University of California, Irvine YouTube Stream: https://youtu.be/ImvsK5CFp0w Ensemble methods for reinforcement learning have gained attention in recent years, due to their ability to represent model uncertainty and use it to guide exploration and to reduce value estimation bias. We present MeanQ, a very simple ensemble method with improved performance, and show how it reduces estimation variance enough to operate without a stabilizing target network. Curiously, MeanQ is theoretically *almost* equivalent to a non-ensemble state-of-the-art method that it significantly outperforms, raising questions about the interaction between uncertainty estimation, representation, and resampling.
In adversarial environments, where a second agent attempts to minimize the first’s rewards, double-oracle (DO) methods grow a population of policies for both agents by iteratively adding the best response to the current population. DO algorithms are guaranteed to converge when they exhaust all policies, but are only effective when they find a small population sufficient to induce a good agent. We present XDO, a DO algorithm that exploits the game’s sequential structure to exponentially reduce the worst-case population size. Curiously, the smaller population size that XDO needs to find good agents more than compensates for the increased difficulty of each iteration at a given population size. Bio: Roy Fox is an Assistant Professor and director of the Intelligent Dynamics Lab in the Department of Computer Science at UCI. He was previously a postdoc in UC Berkeley’s BAIR, RISELab, and AUTOLAB, where he developed algorithms and systems that interact with humans to learn structured control policies for robotics and program synthesis. His research interests include theory and applications of reinforcement learning, algorithmic game theory, information theory, and robotics. His current research focuses on structure, exploration, and optimization in deep reinforcement learning and imitation learning of virtual and physical agents and multi-agent systems. |
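For the MeanQ portion of the talk above, here is a minimal sketch, assuming a list of Q-networks and batched transitions, of computing TD targets from the ensemble mean. It is a simplified reading of the idea (averaging reduces target variance), not the full algorithm with its resampling and exploration details.

```python
import torch

def mean_q_targets(q_ensemble, next_obs, rewards, dones, gamma=0.99):
    """Compute TD targets from the mean of an ensemble of Q-networks.

    q_ensemble: list of networks mapping a batch of observations to (batch, actions)
    Q-values. Per the talk, averaging ensemble members reduces target variance
    enough that a separate stabilizing target network may be unnecessary.
    """
    with torch.no_grad():
        q_next = torch.stack([q(next_obs) for q in q_ensemble]).mean(dim=0)
        best_next = q_next.max(dim=-1).values
        return rewards + gamma * (1.0 - dones) * best_next
```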
|
January 17 |
No Seminar (Martin Luther King, Jr. Day)
|
Ransalu Senanayake Postdoctoral Scholar Department of Computer Science Stanford University YouTube Stream: https://youtu.be/3yR8BqBElXw Autonomous agents such as self-driving cars have already gained the capability to perform individual tasks such as object detection and lane following, especially in simple, static environments. While advancing robots towards full autonomy, it is important to minimize deleterious effects on humans and infrastructure to ensure the trustworthiness of such systems. However, for robots to safely operate in the real world, it is vital for them to quantify the multimodal aleatoric and epistemic uncertainty around them and use that uncertainty for decision-making. In this talk, I will discuss how we can leverage tools from approximate Bayesian inference, kernel methods, and deep neural networks to develop interpretable autonomous systems for high-stakes applications.
Bio: Ransalu Senanayake is a postdoctoral scholar in the Statistical Machine Learning Group at the Department of Computer Science, Stanford University. He focuses on making downstream applications of machine learning trustworthy by quantifying uncertainty and explaining the decisions of such systems. Currently, he works with Prof. Emily Fox and Prof. Carlos Guestrin. He also worked on decision-making under uncertainty with Prof. Mykel Kochenderfer. Prior to joining Stanford, Ransalu obtained a PhD in Computer Science from the University of Sydney, Australia, and an MPhil in Industrial Engineering and Decision Analytics from the Hong Kong University of Science and Technology, Hong Kong. |
|
Dylan Slack PhD Student Department of Computer Science University of California, Irvine YouTube Stream: https://youtu.be/71RJvjPhk3U For domain experts to adopt machine learning (ML) models in high-stakes settings such as health care and law, they must understand and trust model predictions. As a result, researchers have proposed numerous ways to explain the predictions of complex ML models. However, these approaches suffer from several critical drawbacks, such as vulnerability to adversarial attacks, instability, inconsistency, and lack of guidance about accuracy and correctness. For practitioners to safely use explanations in the real world, it is vital to properly characterize the limitations of current techniques and develop improved explainability methods. This talk will describe the shortcomings of explanations and introduce current research demonstrating how they are vulnerable to adversarial attacks. I will also discuss promising solutions and present recent work on explanations that leverage uncertainty estimates to overcome several critical explanation shortcomings.
Bio: Dylan Slack is a Ph.D. candidate at UC Irvine advised by Sameer Singh and Hima Lakkaraju and associated with UCI NLP, CREATE, and the HPI Research Center. His research focuses on developing techniques that help researchers and practitioners build more robust, reliable, and trustworthy machine learning models. In the past, he has held research internships at Google AI and Amazon AWS, and he was previously an undergraduate at Haverford College, advised by Sorelle Friedler, where he researched fairness in machine learning. |
|
Maja Rudolph Senior Research Scientist Bosch Center for AI YouTube Stream: https://youtu.be/9fRw74WhRdE Recurrent neural networks (RNNs) are a popular choice for modeling sequential data. Standard RNNs assume constant time-intervals between observations. However, in many datasets (e.g. medical records) observation times are irregular and can carry important information. To address this challenge, we propose continuous recurrent units (CRUs) – a neural architecture that can naturally handle irregular intervals between observations. The CRU assumes a hidden state which evolves according to a linear stochastic differential equation and is integrated into an encoder-decoder framework. The recursive computations of the CRU can be derived using the continuous-discrete Kalman filter and are in closed form. The resulting recurrent architecture has temporal continuity between hidden states and a gating mechanism that can optimally integrate noisy observations. We derive an efficient parametrization scheme for the CRU that leads to a fast implementation (f-CRU). We empirically study the CRU on a number of challenging datasets and find that it can interpolate irregular time series better than methods based on neural ordinary differential equations.
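The mechanism the CRU builds on can be illustrated with a continuous-discrete Kalman prediction step over an irregular time gap. The dynamics matrix and the crude `Q * dt` process-noise term below are illustrative assumptions; the CRU itself wraps such closed-form updates inside an encoder-decoder recurrent architecture, which this standalone sketch does not reproduce.

```python
import numpy as np
from scipy.linalg import expm

def predict_between_observations(mean, cov, A, Q, dt):
    """Propagate a latent Gaussian state across an irregular gap of length dt.

    The latent state is assumed to follow a linear SDE dz = A z dt + noise, so
    the mean and covariance can be advanced in closed form between observations.
    The Q * dt term is a crude approximation of the accumulated process noise.
    """
    F = expm(A * dt)                      # state transition over the gap
    new_mean = F @ mean
    new_cov = F @ cov @ F.T + Q * dt
    return new_mean, new_cov

A = np.array([[0.0, 1.0], [-1.0, -0.1]])  # toy damped-oscillator dynamics
mean, cov = np.zeros(2), np.eye(2)
mean, cov = predict_between_observations(mean, cov, A, Q=0.05 * np.eye(2), dt=0.73)
```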
Bio: Maja Rudolph is a Senior Research Scientist at the Bosch Center for AI where she works on machine learning research questions derived from engineering problems: for example, how to model driving behavior, how to forecast the operating conditions of a device, or how to find anomalies in the sensor data of an assembly line. In 2018, Maja completed her Ph.D. in Computer Science at Columbia University, advised by David Blei. She holds a MS in Electrical Engineering from Columbia University and a BS in Mathematics from MIT. |
|
Ruiqi Gao Research Scientist Google, Brain Team Energy-based models (EBMs) are an appealing class of probabilistic models, which can be viewed as generative versions of discriminators, yet can be learned from unlabeled data. Despite a number of desirable properties, two challenges remain for training EBMs on high-dimensional datasets. First, learning EBMs by maximum likelihood requires Markov Chain Monte Carlo (MCMC) to generate samples from the model, which can be extremely expensive. Second, the energy potentials learned with non-convergent MCMC can be highly biased, making it difficult to evaluate the learned energy potentials or apply the learned models to downstream tasks. In this talk, I will present two algorithms to tackle the challenges of training EBMs. (1) Diffusion Recovery Likelihood, where we tractably learn and sample from a sequence of EBMs trained on increasingly noisy versions of a dataset. Each EBM is trained with recovery likelihood, which maximizes the conditional probability of the data at a certain noise level given their noisy versions at a higher noise level. (2) Flow Contrastive Estimation, where we jointly estimate an EBM and a flow-based model, in which the two models are iteratively updated based on a shared adversarial value function. We demonstrate that EBMs can be trained with a small budget of MCMC or completely without MCMC. The learned energy potentials are faithful and can be applied to likelihood evaluation and downstream tasks, such as feature learning and semi-supervised learning. Bio: Ruiqi Gao is a research scientist at Google, Brain team. Her research interests are in statistical modeling and learning, with a focus on generative models and representation learning. She received her Ph.D. degree in statistics from the University of California, Los Angeles (UCLA) in 2021 advised by Song-Chun Zhu and Ying Nian Wu. Prior to that, she received her bachelor’s degree from Peking University. Her recent research themes include scalable training algorithms of deep generative models, variational inference, and representational models with implications in neuroscience. |
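As a rough sketch of the recovery-likelihood idea, the snippet below samples the conditional p(x | noisy x) with Langevin dynamics; `energy_fn` stands in for a learned energy network, the step sizes are arbitrary, and the training loop and noise schedule from the talk are omitted.

```python
import torch

def langevin_recovery_sample(energy_fn, x_noisy, sigma, n_steps=30, step_size=1e-2):
    """Langevin sampling from p(x | x_noisy) ∝ exp(f(x) - ||x - x_noisy||^2 / (2 sigma^2)).

    Conditioning on a noisy version of the data makes this conditional much
    easier to sample than the marginal EBM, which is the key point of recovery
    likelihood. `energy_fn` is assumed to map a batch of inputs to per-sample
    energies f(x).
    """
    x = x_noisy.clone()
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        log_p = energy_fn(x).sum() - ((x - x_noisy) ** 2).sum() / (2 * sigma ** 2)
        grad, = torch.autograd.grad(log_p, x)
        x = x + 0.5 * step_size * grad + (step_size ** 0.5) * torch.randn_like(x)
    return x.detach()
```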
|
February 21 |
No Seminar (Presidents’ Day)
|
Sunipa Dev Research Scientist Google AI Large language models are commonly used in different paradigms of natural language processing and machine learning and are known for their efficiency as well as their overall lack of interpretability. Their data-driven approach to emulating human language often results in human biases being encoded and even amplified, potentially leading to cyclic propagation of representational and allocational harms. In this talk, we discuss some aspects of detecting, evaluating, and mitigating biases and associated harms in a holistic, inclusive, and culturally aware manner. In particular, we discuss the disparate impact on society of common language tools that are not inclusive of all gender identities.
Bio: Sunipa Dev is a Research Scientist on the Ethical AI team at Google AI. Previously, she was an NSF Computing Innovation Fellow at UCLA, before which she completed her PhD at the University of Utah. Her ongoing research focuses on various facets of fairness and interpretability in NLP, including robust measurements of bias, cross-cultural understanding of concepts in NLP, and inclusive language representations. |
|
March 7 Zoom 1 pm |
Mukund Sundararajan Principal Research Scientist YouTube Stream unavailable, please join via Zoom Predicting cancer from XRays seemed great Until we discovered the true reason. The model, in its glory, did fixate On radiologist markings – treason! We found the issue with attribution: By blaming pixels for the prediction (1,2,3,4,5,6). A complement’ry way to attribute, is to pay training data, a tribute (1). If you are int’rested in FTC, counterfactual theory, SGD Or Shapley values and fine kernel tricks, Please come attend, unless you have conflicts Should you build deep models down the road, Use attributions. Takes ten lines of code! Bio: There once was an RS called MS, The models he studies are a mess, A director at Google. Accurate and frugal, Explanations are what he likes best. |
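In the spirit of the closing couplet, here is a hedged, ten-ish-line sketch of integrated gradients, one of the speaker's well-known attribution methods; the model interface, baseline choice, and step count are assumptions, and this is not a substitute for a production implementation.

```python
import torch

def integrated_gradients(model, x, baseline, target_class, steps=50):
    """Average the gradients of the target logit along the straight path from a
    baseline to the input, then scale by (input - baseline). `model` is assumed
    to map a batch of inputs to class logits."""
    total = torch.zeros_like(x)
    for k in range(1, steps + 1):
        point = baseline + (k / steps) * (x - baseline)   # point on the path
        point.requires_grad_(True)
        score = model(point.unsqueeze(0))[0, target_class]
        grad, = torch.autograd.grad(score, point)
        total += grad
    return (x - baseline) * total / steps                 # per-feature attributions
```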
March 14 |
No Seminar (Finals Week)
|
NSF CAREER Awards for Stephan Mandt and Sameer Singh
Congratulations to Professors Stephan Mandt and Sameer Singh, who were recently awarded prestigious CAREER awards for basic research from the National Science Foundation. Professor Mandt’s research will focus on a unified set of mathematical and statistical tools for resource-efficient deep learning, with expected applications to new methods for compressing both neural networks and their data (e.g., images and video), as well as new algorithms for faster training. Professor Singh will develop new techniques and methodologies to address vulnerabilities in current state-of-the-art natural language processing models based on deep learning, supporting more robust training and evaluation, with applications to automated methods for finding and detecting problems in such models, explaining them to users, and fixing them.
New Book from Pierre Baldi on the Sciences and Deep Learning
Professor Pierre Baldi has published a new text that bridges the gap between deep learning and the natural sciences. Titled Deep Learning in Science (Cambridge University Press, 2021), the text provides readers with a perspective that there is “a principled, foundational approach to machine learning” and readers “are made aware of the many interesting applications in natural sciences as opposed to just in engineering and commerce” (quoting Professor Baldi).
Spring 2021
Live Stream for all Spring 2021 CML Seminars
March 29 |
No Seminar
|
April 5th |
No Seminar
|
Sanmi Koyejo Assistant Professor Department of Computer Science University of Illinois at Urbana-Champaign YouTube Stream: https://youtu.be/Ehqsp8vRLis Across healthcare, science, and engineering, we increasingly employ machine learning (ML) to automate decision-making that, in turn, affects our lives in profound ways. However, ML can fail, with significant and long-lasting consequences. Reliably measuring such failures is the first step towards building robust and trustworthy learning machines. Consider algorithmic fairness, where widely-deployed fairness metrics can exacerbate group disparities and result in discriminatory outcomes. Moreover, existing metrics are often incompatible. Hence, selecting fairness metrics is an open problem. Measurement is also crucial for robustness, particularly in federated learning with error-prone devices. Here, once again, models constructed using well-accepted robustness metrics can fail. Across ML applications, the dire consequences of mismeasurement are a recurring theme. This talk will outline emerging strategies for addressing the measurement gap in ML and how this impacts trustworthiness.
Bio: Sanmi (Oluwasanmi) Koyejo is an Assistant Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. Koyejo’s research interests are in developing the principles and practice of trustworthy machine learning. Additionally, Koyejo focuses on applications to neuroscience and healthcare. Koyejo completed his Ph.D. in Electrical Engineering at the University of Texas at Austin, advised by Joydeep Ghosh, and completed postdoctoral research at Stanford University. His postdoctoral research was primarily with Russell A. Poldrack and Pradeep Ravikumar. Koyejo has been the recipient of several awards, including a best paper award from the conference on uncertainty in artificial intelligence (UAI), a Sloan Fellowship, a Kavli Fellowship, an IJCAI early career spotlight, and a trainee award from the Organization for Human Brain Mapping (OHBM). Koyejo serves on the board of the Black in AI organization. |
|
April 19th Sponsored by the Steckler Center for Responsible, Ethical, and Accessible Technology (CREATE) 4 pm (Note change in time) |
Kate Crawford Senior Principal Researcher, Microsoft Research, New York Distinguished Visiting Fellow at the University of Melbourne Where do the motivating ideas behind Artificial Intelligence come from and what do they imply? What claims to universality or particularity are made by AI systems? How do the movements of ideas, data, and materials shape the present and likely futures of AI development? Join us for a conversation with social scientist and AI scholar Kate Crawford about the intellectual history and geopolitical contexts of contemporary AI research and practice.
Bio: Kate Crawford is a leading scholar of the social and political implications of artificial intelligence. Over her 20-year career, her work has focused on understanding large-scale data systems, machine learning, and AI in the wider contexts of history, politics, labor, and the environment. She is a Research Professor of Communication and STS at USC Annenberg, a Senior Principal Researcher at MSR-NYC, and the inaugural Visiting Chair for AI and Justice at the École Normale Supérieure in Paris. In 2021, she will be the Miegunyah Distinguished Visiting Fellow at the University of Melbourne, and she has been appointed an Honorary Professor at the University of Sydney. She previously co-founded the AI Now Institute at New York University. Kate has advised policy makers in the United Nations, the Federal Trade Commission, the European Parliament, and the White House. Her academic research has been published in journals such as Nature, New Media & Society, Science, Technology & Human Values, and Information, Communication & Society. Beyond academic journals, Kate has also written for The New York Times, The Atlantic, and Harper’s Magazine, among others. |
Yibo Yang PhD Student Department of Computer Science University of California, Irvine YouTube Stream: https://youtu.be/1lXKUhBTHWc Probabilistic machine learning, particularly deep learning, is reshaping the field of data compression. Recent work has established a close connection between lossy data compression and latent variable models such as variational autoencoders (VAEs), and VAEs are now the building blocks of many learning-based lossy compression algorithms that are trained on massive amounts of unlabeled data. In this talk, I give a brief overview of learned data compression, including the current paradigm of end-to-end lossy compression with VAEs, and present my research that addresses some of its limitations and explores other possibilities of learned data compression. First, I present algorithmic improvements inspired by variational inference that push the performance limits of VAE-based lossy compression, resulting in a new state-of-the-art performance on image compression. Then, I introduce a new algorithm that compresses the variational posteriors of pre-trained latent variable models, and allows for variable-bitrate lossy compression with a vanilla VAE. Lastly, I discuss ongoing work that explores fundamental bounds on the theoretical performance of lossy compression algorithms, using the tools of stochastic approximation and deep learning.
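For reference, the standard objective behind end-to-end lossy compression with VAEs can be written as a rate-distortion Lagrangian (notation schematic; the talk's contributions refine and extend this setup):

```latex
% beta trades off rate against distortion; q_phi is the (approximate) posterior
% over latents z, p_theta(z) the prior used by the entropy coder, and
% p_theta(x|z) the decoder / distortion model.
\[
  \mathcal{L}(\phi, \theta)
  \;=\;
  \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\bigl[-\log p_\theta(x \mid z)\bigr]}_{\text{distortion}}
  \;+\;
  \beta\,
  \underbrace{\mathbb{E}_{q_\phi(z \mid x)}\!\bigl[-\log p_\theta(z)\bigr]}_{\text{rate}} .
\]
```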
Bio: Yibo Yang is a PhD student advised by Stephan Mandt in the Computer Science department at UC Irvine. His research interests include probability theory, information theory, and their applications in statistical machine learning. |
|
Levi Lelis Assistant Professor Department of Computer Science University of Alberta YouTube Stream: https://youtu.be/76NFMs9pHEE In this talk I will describe two tree search algorithms that use a policy to guide the search. I will start with Levin tree search (LTS), a best-first search algorithm that has guarantees on the number of nodes it needs to expand to solve state-space search problems. These guarantees are based on the quality of the policy it employs. I will then describe Policy-Guided Heuristic Search (PHS), another best-first search algorithm that uses both a policy and a heuristic function to guide the search. PHS also has guarantees on the number of nodes it expands, which are based on the quality of the policy and of the heuristic function employed. I will then present empirical results showing that LTS and PHS compare favorably with A*, Weighted A*, Greedy Best-First Search, and PUCT on a set of single-agent shortest-path problems.
Bio: Levi Lelis is an Assistant Professor at the University of Alberta, Canada, and a Professor on leave from Universidade Federal de Viçosa, Brazil. Levi is interested in heuristic search, machine learning, and program synthesis. |
|
David Alvarez-Melis Postdoctoral Researcher Microsoft Research New England YouTube Stream: https://youtu.be/52bQ_XUY2DQ Abstract: Success stories in machine learning seem to be ubiquitous, but they tend to be concentrated on ‘ideal’ scenarios where clean labeled data are abundant, evaluation metrics are unambiguous, and operational constraints are rare — if at all existent. But machine learning in practice is rarely so ‘pristine’; clean data is often scarce, resources are limited, and constraints (e.g., privacy, transparency) abound in most real-life applications. In this talk we will explore how to reconcile these paradigms along two main axes: (i) learning with scarce or heterogeneous data, and (ii) making complex models, such as neural networks, interpretable.
First, I will present various approaches that I have developed for ‘amplifying’ (e.g, merging, transforming, interpolating) datasets based on the theory of Optimal Transport. Through applications in machine translation, transfer learning, and dataset shaping, I will show that besides enjoying sound theoretical footing, these approaches yield efficient and high-performing algorithms. In the second part of the talk, I will present some of my work on designing methods to extract ‘explanations’ from complex models and on imposing on them some basic formal notions that I argue any interpretability method should satisfy, but which most lack. Finally, I will present a novel framework for interpretable machine learning that takes inspiration from the study of (human) explanation in the social sciences, and whose evaluation through user studies yields insights about the promise (and limitations) of interpretable AI tools.
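A toy sketch of OT-style dataset interpolation under simplifying assumptions (equal-sized point clouds, uniform weights, squared-Euclidean cost), where the optimal transport plan reduces to an assignment problem; the talk's methods operate on full labeled datasets and richer OT formulations.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def ot_interpolate(X_src, X_tgt, t=0.5):
    """Match source points to target points optimally, then move each source
    point a fraction t of the way toward its matched target."""
    cost = cdist(X_src, X_tgt, metric="sqeuclidean")
    rows, cols = linear_sum_assignment(cost)
    return (1 - t) * X_src[rows] + t * X_tgt[cols]

rng = np.random.default_rng(0)
X_a = rng.normal(loc=0.0, size=(100, 2))
X_b = rng.normal(loc=3.0, size=(100, 2))
X_mid = ot_interpolate(X_a, X_b, t=0.5)   # a dataset "between" the two point clouds
```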
Bio: David Alvarez-Melis is a postdoctoral researcher in the Machine Learning and Statistics Group at Microsoft Research, New England. He recently obtained a Ph.D. in computer science from MIT advised by Tommi Jaakkola, and holds B.Sc. and M.S. degrees in mathematics from ITAM and Courant Institute (NYU), respectively. He has previously spent time at IBM Research and is a recipient of CONACYT, Hewlett Packard, and AI2 awards. |
|
Megan Peters Assistant Professor Department of Cognitive Sciences UC Irvine YouTube Stream: https://youtu.be/i9Cenn0stxE Abstract: TBA
Bio: In March 2020 I joined the UCI Department of Cognitive Sciences. I’m also a Cooperating Researcher in the Department of Decoded Neurofeedback at Advanced Telecommunications Research Institute International in Kyoto, Japan. Prior to that, from 2017 I was on the faculty at UC Riverside in the Department of Bioengineering. I received my Ph.D. in computational cognitive neuroscience (psychology) from UCLA, and then was a postdoc there as well. My research aims to reveal how the brain represents and uses uncertainty, and performs adaptive computations based on noisy, incomplete information. I specifically focus on how these abilities support metacognitive evaluations of the quality of (mostly perceptual) decisions, and how these processes might relate to phenomenology and conscious awareness. I use neuroimaging, computational modeling, machine learning and neural stimulation techniques to study these topics. |
|
Jing Zhang Assistant Professor Department of Computer Science University of California, Irvine YouTube Stream: https://youtu.be/HPPq5Xvlr9c The recent advances in sequencing technologies provide unprecedented opportunities to decipher
the multi-scale gene regulatory grammars at diverse cellular states. Here, we will introduce our
computational efforts on cell/gene representation learning to extract biologically meaningful
information from high-dimensional, sparse, and noisy genomic data. First, we proposed a deep
generative model, named SAILER, to learn the low-dimensional latent cell representations from
single-cell epigenetic data for accurate cell state characterization. SAILER adopted the
conventional encoder-decoder framework and imposed additional constraints for biologically
robust cell embeddings invariant to confounding factors. Then at the network level, we
developed TopicNet using latent Dirichlet allocation (LDA) to extract latent gene communities
and quantify regulatory network connectivity changes (network “rewiring”) between diverse cell
states. We applied our TopicNet model on 13 different cancer types and highlighted gene
communities that impact patient prognosis in multiple cancer types.
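A schematic analogue of the TopicNet step, assuming a synthetic regulator-by-gene count matrix: it only shows how LDA can recover latent gene communities, and does not reproduce the data processing or the network "rewiring" analysis from the talk.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Treat each regulatory "document" (e.g., a regulator's set of target genes) as a
# bag of genes; LDA then yields latent gene communities. Synthetic data only.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=1.0, size=(200, 500))   # 200 regulators x 500 genes

lda = LatentDirichletAllocation(n_components=10, random_state=0)
doc_topic = lda.fit_transform(counts)            # regulator-by-community weights
gene_topic = lda.components_                     # community-by-gene loadings
```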
Bio: Dr. Zhang is an Assistant Professor at UCI. Her research interests are in the areas of bioinformatics and computational biology. She graduated from USC Electrical Engineering under the supervision of Dr. Liang Chen and Dr. C.-C. Jay Kuo. She completed her postdoc training at Yale University in Dr. Mark Gerstein’s lab. During her postdoc, she developed several computational methods that integrate novel high-throughput sequencing assays to decipher the gene regulation “grammar”. Her current research focuses on developing computational methods to predict the impact of genomic variations on genome function and phenotype at single-cell resolution. |
|
May 31 |
No Seminar (Memorial Day)
|
June 7th |
No Seminar (Finals Week)
|