Fall 2019 | Center for Machine Learning and Intelligent Systems

Sep 23	No Seminar
Sep 30 4011 Bren Hall 1 pm	Nia Dowell Assistant Professor School of Education University of California, Irvine Group Communication Analysis: Applications for Online Learning Environments Educational environments have become increasingly reliant on computer-mediated communication, relying on video conferencing, synchronous chats, and asynchronous forums, in both small (5-20 learners) and massive (1000+ learner) learning environments. These platforms, which are designed to support or even supplant traditional instruction, have become common-place across all levels of education, and as a result created big data in education. In order to move forward, the learning sciences field is in need of new automated approaches that offer deeper insights into the dynamics of learner interaction and discourse across online learning platforms. This talk will present results from recent work that uses language and discourse to capture social and cognitive dynamics during collaborative interactions. I will introduce group communication analysis (GCA), a novel approach for detecting emergent learner roles from the participants’ contributions and patterns of interaction. This method makes use of automated computational linguistic analysis of the sequential interactions of participants in online group communication to create distinct interaction profiles. We have applied the GCA to several collaborative learning datasets. Cluster analysis, predictive, and hierarchical linear mixed-effects modeling were used to assess the validity of the GCA approach, and practical influence of learner roles on student and overall group performance. The results indicate that learners’ patterns in linguistic coordination and cohesion are representative of the roles that individuals play in collaborative discussions. More broadly, GCA provides a framework for researchers to explore the micro intra- and inter-personal patterns associated with the participants’ roles and the sociocognitive processes related to successful collaboration. Bio: I am an assistant professor in the School of Education at UCI. My primary interests are in cognitive psychology, discourse processing, group interaction, and learning analytics. In general, my research focuses on using language and discourse to uncover the dynamics of socially significant, cognitive, and affective processes. I am currently applying computational techniques to model discourse and social dynamics in a variety of environments including small group computer-mediated collaborative learning environments, collaborative design networks, and massive open online courses (MOOCs). My research has also extended beyond the educational and learning sciences spaces and highlighted the practical applications of computational discourse science in the clinical, political and social sciences areas.
Oct 7 4011 Bren Hall 1 pm	Shashank Srivastava Assistant Professor Computer Science UNC Chapel Hill Conversational Machine Learning Humans can efficiently learn and communicate new knowledge about the world through natural language (e.g, the concept of important emails may be described through explanations like ‘late night emails from my boss are usually important’). Can machines be similarly taught new tasks and behavior through natural language interactions with their users? In this talk, we’ll explore two approaches towards language-based learning for classifications tasks. First, we’ll consider how language can be leveraged for interactive feature space construction for learning tasks. I’ll present a method that jointly learns to understand language and learn classification models, by using explanations in conjunction with a small number of labeled examples of the concept. Secondly, we’ll examine an approach for using language as a substitute for labeled supervision for training machine learning models, which leverages the semantics of quantifier expressions in everyday language (`definitely’, `sometimes’, etc.) to enable learning in scenarios with limited or no labeled data. Bio: Shashank Srivastava is an assistant professor in the Computer Science department at the University of North Carolina (UNC) Chapel Hill. Shashank received his PhD from the Machine Learning department at CMU in 2018, and was an AI Resident at Microsoft Research in 2018-19. Shashank’s research interests lie in conversational AI, interactive machine learning and grounded language understanding. Shashank has an undergraduate degree in Computer Science from IIT Kanpur, and a Master’s degree in Language Technologies from CMU. He received the Yahoo InMind Fellowship for 2016-17; his research has been covered by popular media outlets including GeekWire and New Scientist.
Oct 14 4011 Bren Hall 1 pm	Bhuwan Dhingra PhD Student Language Technologies Institute Carnegie Mellon University Text as a Virtual Knowledge Base Structured Knowledge Bases (KBs) are extremely useful for applications such as question answering and dialog, but are difficult to populate and maintain. People prefer expressing information in natural language, and hence text corpora, such as Wikipedia, contain more detailed up-to-date information. This raises the question — can we directly treat text corpora as knowledge bases for extracting information on demand? In this talk I will focus on two problems related to this question. First, I will look at augmenting incomplete KBs with textual knowledge for question answering. I will describe a graph neural network model for processing heterogeneous data from the two sources. Next, I will describe a scalable approach for compositional reasoning over the contents of the text corpus, analogous to following a path of relations in a structured KB to answer multi-hop queries. I will conclude by discussing interesting future research directions in this domain. Bio: Bhuwan Dhingra is a final year PhD student at Carnegie Mellon University, advised by William Cohen and Ruslan Salakhutdinov. His research uses natural language processing and machine learning to build an interface between AI applications and world knowledge (facts about people, places and things). His work is supported by the Siemens FutureMakers PhD fellowship. Prior to joining CMU, Bhuwan completed his undergraduate studies at IIT Kanpur in 2013, and spent two years at Qualcomm Research in the beautiful city of San Diego.
Oct 21 4011 Bren Hall 1 pm	Robert Bamler Postdoctoral Researcher Dept. of Computer Science University of California, Irvine Revisiting Variational Expectation Maximization Bayesian inference is often advertised for applications where posterior uncertainties matter. A less appreciated advantage of Bayesian inference is that it allows for highly scalable model selection (“hyperparameter tuning”) via the Expectation Maximization (EM) algorithm and its approximate variant, variational EM. In this talk, I will present both an application and an improvement of variational EM. The application is for link prediction in knowledge graphs, where a probabilistic approach and variational EM allowed us to train highly flexible models with more than ten thousand hyperparameters, improving predictive performance. In the second part of the talk, I will propose a new family of objective functions for variational EM. We will see that existing versions of variational inference in the literature can be interpreted as various forms of biased importance sampling of the marginal likelihood. Combining this insight with ideas from perturbation theory in statistical physics will lead us to a tighter bound on the true marginal likelihood and to better predictive performance of Variational Autoencoders. Bio: Robert Bamler is a Postdoc at UCI in the group of Prof. Stephan Mandt. His interests are probabilistic embedding models, variational inference, and probabilistic deep learning methods for data compression. Before joining UCI in December of 2018, Rob worked in the statistical machine learning group at Disney Research in Pittsburgh and Los Angeles. He received his PhD in theoretical statistical and quantum physics from University of Cologne, Germany.
Oct 28 4011 Bren Hall 1 pm	Zhou Yu Assistant Professor Dept. of Computer Science University of California, Davis Augment intelligence with multimodal information Humans interact with other humans or the world through information from various channels including vision, audio, language, haptics, etc. To simulate intelligence, machines require similar abilities to process and combine information from different channels to acquire better situation awareness, better communication ability, and better decision-making ability. In this talk, we describe three projects. In the first study, we enable a robot to utilize both vision and audio information to achieve better user understanding. Then we use incremental language generation to improve the robot’s communication with a human. In the second study, we utilize multimodal history tracking to optimize policy planning in task-oriented visual dialogs. In the third project, we tackle the well-known trade-off between dialog response relevance and policy effectiveness in visual dialog generation. We propose a new machine learning procedure that alternates from supervised learning and reinforcement learning to optimum language generation and policy planning jointly in visual dialogs. We will also cover some recent ongoing work on image synthesis through dialogs, and generating social multimodal dialogs with a blend of GIF and words. Bio: Zhou Yu is an Assistant Professor at the Computer Science Department at UC Davis. She received her PhD from Carnegie Mellon University in 2017. Zhou is interested in building robust and multi-purpose dialog systems using fewer data points and less annotation. She also works on language generation, vision and language tasks. Zhou’s work on persuasive dialog systems received an ACL 2019 best paper nomination recently. Zhou was featured in Forbes as 2018 30 under 30 in Science for her work on multimodal dialog systems. Her team recently won the 2018 Amazon Alexa Prize on building an engaging social bot for a $500,000 cash award.
Nov 4	Geng Ji PhD Student Dept of Computer Science University of California, Irvine Variational Inference: To Derive or Not To Derive Variational inference provides a general optimization framework to approximate the posterior distributions of latent variables in probabilistic models. Although effective in simple scenarios, it may be inaccurate or infeasible when the data is high-dimensional, the model structure is complicated, or variable relationships are non-conjugate. In this talk, I will present two different strategies to solve these problems. The first one is to derive rigorous variational bounds by leveraging the probabilistic relations and structural dependencies of the given model. One example I will explore is large-scale noisy-OR Bayesian networks popular in IT companies for analyzing the semantic content of massive text datasets. The second strategy is to create flexible algorithms directly applicable to many models, as can be expressed by probabilistic programming systems. I’ll talk about a low-variance Monte Carlo variational inference framework we recently developed for arbitrary models with discrete variables. It has appealing advantages over REINFORCE-style stochastic gradient estimates and model-dependent auxiliary-variable solutions, as demonstrated on real-world models of images, text, and social networks. Bio: Geng Ji is a PhD candidate in the CS Department of UC Irvine, advised by Professor Erik Sudderth. His research interests are broadly in probabilistic graphical models, large-scale variational inference, as well as their applications in computer vision and natural language processing. He did summer internships at Disney Research in 2017 mentored by Professor Stephan Mandt, and Facebook AI in 2018 which he will join as a full-time research scientist.
Nov 11	Veterans Day
Nov 18 4011 Bren Hall 1 pm	John T. Halloran Postdoctoral Researcher Dept. of Biomedical Engineering University of California, Davis Accelerated Machine Learning for Computational Proteomics In the past few decades, mass spectrometry-based proteomics has dramatically improved our fundamental knowledge of biology, leading to advancements in the understanding of diseases and methods for clinical diagnoses. However, the complexity and sheer volume of typical proteomics datasets make both fast and accurate analysis difficult to accomplish simultaneously; while machine learning methods have proven themselves capable of incredibly accurate proteomic analysis, such methods deter use by requiring extremely long runtimes in practice. In this talk, we will discuss two core problems in computational proteomics and how to accelerate the training of their highly accurate, but slow, machine learning solutions. For the first problem, wherein we seek to infer the protein subsequences (called peptides) present in a biological sample, we will improve the training of graphical models by deriving emission functions which render conditional-maximum likelihood learning concave. Used within a dynamic Bayesian network, we show that these emission functions not only allow extremely efficient learning of globally-convergent parameters, but also drastically outperform the state-of-the-art in peptide identification accuracy. For the second problem, wherein we seek to further improve peptide identification accuracy by classifying correct versus incorrect identifications, we will speed up the state-of-the-art in discriminative learning using a combination of improved convex optimization and extensive parallelization. We show that on massive datasets containing hundreds-of-millions of peptide identifications, these speedups reduce discriminative analysis time from several days down to just several hours, without any degradation in analysis quality. Bio: John Halloran is a Postdoc at UC Davis working with Professor David Rocke. He received his PhD from the University of Washington in 2016. John is interested in developing fast and accurate machine learning solutions for massive-scale problems encountered in computational biology. His work regularly focuses on efficient generative and discriminative training of dynamic graphical models. He is a recipient of the UC Davis Award for Excellence in Postdoctoral Research and a UW Genome Training Grant.
Nov 25 4011 Bren Hall 1 pm	Xanda Schofield Assistant Professor Dept. of Computer Science Harvey Mudd College Towards Practical and Locally Private Topic Models A critical challenge in the large-scale analysis of people’s data is protecting the privacy of the people who generated it. Of particular interest is how to privately infer models over discrete count data, like frequencies of words in a message or the number of times two people have interacted. Recently, I helped to develop locally private Bayesian Poisson factorization, a method for differentially private inference for a large family of models of count data, including topic models, stochastic block models, event models, and beyond. However, in the domain of topic models over text, this method can encounter serious obstacles in both speed and model quality. These arise from the collision of high-dimensional, sparse counts of text features in a bag-of-words representation, and dense noise from a privacy mechanism. In this talk, I address several challenges in the space of private statistical model inference over language data, as well as corresponding approaches to produce interpretable models. Bio: Xanda Schofield is an Assistant Professor in Computer Science at Harvey Mudd College. Her work focuses on practical applications of unsupervised models of text, particularly topic models, to research in the humanities and social sciences. More recently, her work has expanded to the intersection of privacy and text mining. She completed her Ph.D. in 2019 at Cornell University advised by David Mimno. In her graduate career, she was the recipient of an NDSEG Fellowship, the Anita Borg Memorial Scholarship, and the Microsoft Graduate Women’s Scholarship. She is also an avid cookie baker and tweets @XandaSchofield.
Dec 2 4011 Bren Hall 1 pm	Shayan Doroudi Assistant Professor School of Education University of California, Irvine Bias, Variance, and the Intertwined Histories of Artificial Intelligence and Education Research This talk will be divided into two parts. In the first part, I will demonstrate that the bias-variance tradeoff in machine learning and statistics can be generalized to offer insights to debates in other scientific fields. In particular, I will show how it can be applied to situate a variety of debates that appear in the education literature. In the second part of my talk, I will give a brief account of how the early history of artificial intelligence was naturally intertwined with the history of education research and the learning sciences. I will use the generalized bias-variance tradeoff as a lens with which to situate different trends that appeared in this history. Today, AI researchers might see education as just another application area, but historically AI and education were integrated into a broader movement to understand and improve intelligence and learning, in humans and in machines. Bio: Shayan Doroudi is an assistant professor at the UC Irvine School of Education. His research is focused on the learning sciences, educational technology, and the educational data sciences. He is particularly interested in studying the prospects and limitations of data-driven algorithms in learning technologies, including lessons that can be drawn from the rich history of educational technology. He earned his B.S. in Computer Science from the California Institute of Technology, and his M.S. and Ph.D. in Computer Science from Carnegie Mellon.
Dec 9	Finals week
Dec 16 4011 Bren Hall 1 pm	Eric Nalisnick Postdoctoral Researcher University of Cambridge/DeepMind Deep Learning Under Covariate Shift Deep neural networks have demonstrated impressive performance in predictive tasks. However, these models have been shown to be brittle, being easily fooled by even small perturbations of the input features (covariates). In this talk, I describe two approaches for handling covariate shift. The first uses a Bayesian prior derived from data augmentation to make the classifier robust to potential test-time shifts. The second strategy is to directly model the covariates using a ‘hybrid model’: a model of the joint distribution over labels and features. In experiments involving this latter approach, we discovered limitations in some existing methods for detecting distributional shift in high-dimensions. I demonstrate that a simple entropy-based goodness-of-fit test can solve some of these issues but conclude by arguing that more investigation is needed. Bio: Eric Nalisnick is a postdoctoral researcher at the University of Cambridge and a part-time research scientist at DeepMind. His research interests span statistical machine learning, with a current emphasis on Bayesian deep learning, generative modeling, and out-of-distribution detection. He received his PhD from the University of California, Irvine, where he was supervised by Padhraic Smyth. Eric has also spent time interning at DeepMind, Twitter, Microsoft, and Amazon.