UCI group develops deep learning approach for Rubik’s cube

Professor Pierre Baldi, with PhD students Forest Agostinelli and Stephen McAleer and senior Alexander Shmakov, has developed a deep reinforcement learning approach to the Rubik’s cube that solves typical problems in about 20 moves. The work was published in Nature Machine Intelligence.

Fall 2019

Sep 23	No Seminar
Sep 30 4011 Bren Hall 1 pm	Nia Dowell Assistant Professor School of Education University of California, Irvine Group Communication Analysis: Applications for Online Learning Environments Educational environments have become increasingly reliant on computer-mediated communication, relying on video conferencing, synchronous chats, and asynchronous forums, in both small (5-20 learners) and massive (1000+ learner) learning environments. These platforms, which are designed to support or even supplant traditional instruction, have become commonplace across all levels of education, and as a result have created big data in education. In order to move forward, the learning sciences field is in need of new automated approaches that offer deeper insights into the dynamics of learner interaction and discourse across online learning platforms. This talk will present results from recent work that uses language and discourse to capture social and cognitive dynamics during collaborative interactions. I will introduce group communication analysis (GCA), a novel approach for detecting emergent learner roles from the participants’ contributions and patterns of interaction. This method makes use of automated computational linguistic analysis of the sequential interactions of participants in online group communication to create distinct interaction profiles. We have applied the GCA to several collaborative learning datasets. Cluster analysis, predictive modeling, and hierarchical linear mixed-effects modeling were used to assess the validity of the GCA approach and the practical influence of learner roles on student and overall group performance. The results indicate that learners’ patterns in linguistic coordination and cohesion are representative of the roles that individuals play in collaborative discussions.
More broadly, GCA provides a framework for researchers to explore the micro intra- and inter-personal patterns associated with the participants’ roles and the sociocognitive processes related to successful collaboration. Bio: I am an assistant professor in the School of Education at UCI. My primary interests are in cognitive psychology, discourse processing, group interaction, and learning analytics. In general, my research focuses on using language and discourse to uncover the dynamics of socially significant, cognitive, and affective processes. I am currently applying computational techniques to model discourse and social dynamics in a variety of environments including small group computer-mediated collaborative learning environments, collaborative design networks, and massive open online courses (MOOCs). My research has also extended beyond the educational and learning sciences spaces and highlighted the practical applications of computational discourse science in the clinical, political and social sciences areas.
Oct 7 4011 Bren Hall 1 pm	Shashank Srivastava Assistant Professor Computer Science UNC Chapel Hill Conversational Machine Learning Humans can efficiently learn and communicate new knowledge about the world through natural language (e.g., the concept of important emails may be described through explanations like ‘late night emails from my boss are usually important’). Can machines be similarly taught new tasks and behavior through natural language interactions with their users? In this talk, we’ll explore two approaches towards language-based learning for classification tasks. First, we’ll consider how language can be leveraged for interactive feature space construction for learning tasks. I’ll present a method that jointly learns to understand language and learn classification models, by using explanations in conjunction with a small number of labeled examples of the concept. Secondly, we’ll examine an approach for using language as a substitute for labeled supervision for training machine learning models, which leverages the semantics of quantifier expressions in everyday language (‘definitely’, ‘sometimes’, etc.) to enable learning in scenarios with limited or no labeled data. Bio: Shashank Srivastava is an assistant professor in the Computer Science department at the University of North Carolina (UNC) Chapel Hill. Shashank received his PhD from the Machine Learning department at CMU in 2018, and was an AI Resident at Microsoft Research in 2018-19. Shashank’s research interests lie in conversational AI, interactive machine learning and grounded language understanding. Shashank has an undergraduate degree in Computer Science from IIT Kanpur, and a Master’s degree in Language Technologies from CMU. He received the Yahoo InMind Fellowship for 2016-17; his research has been covered by popular media outlets including GeekWire and New Scientist.
Oct 14 4011 Bren Hall 1 pm	Bhuwan Dhingra PhD Student Language Technologies Institute Carnegie Mellon University Text as a Virtual Knowledge Base Structured Knowledge Bases (KBs) are extremely useful for applications such as question answering and dialog, but are difficult to populate and maintain. People prefer expressing information in natural language, and hence text corpora, such as Wikipedia, contain more detailed up-to-date information. This raises the question — can we directly treat text corpora as knowledge bases for extracting information on demand? In this talk I will focus on two problems related to this question. First, I will look at augmenting incomplete KBs with textual knowledge for question answering. I will describe a graph neural network model for processing heterogeneous data from the two sources. Next, I will describe a scalable approach for compositional reasoning over the contents of the text corpus, analogous to following a path of relations in a structured KB to answer multi-hop queries. I will conclude by discussing interesting future research directions in this domain. Bio: Bhuwan Dhingra is a final year PhD student at Carnegie Mellon University, advised by William Cohen and Ruslan Salakhutdinov. His research uses natural language processing and machine learning to build an interface between AI applications and world knowledge (facts about people, places and things). His work is supported by the Siemens FutureMakers PhD fellowship. Prior to joining CMU, Bhuwan completed his undergraduate studies at IIT Kanpur in 2013, and spent two years at Qualcomm Research in the beautiful city of San Diego.
Oct 21 4011 Bren Hall 1 pm	Robert Bamler Postdoctoral Researcher Dept. of Computer Science University of California, Irvine Revisiting Variational Expectation Maximization Bayesian inference is often advertised for applications where posterior uncertainties matter. A less appreciated advantage of Bayesian inference is that it allows for highly scalable model selection (“hyperparameter tuning”) via the Expectation Maximization (EM) algorithm and its approximate variant, variational EM. In this talk, I will present both an application and an improvement of variational EM. The application is for link prediction in knowledge graphs, where a probabilistic approach and variational EM allowed us to train highly flexible models with more than ten thousand hyperparameters, improving predictive performance. In the second part of the talk, I will propose a new family of objective functions for variational EM. We will see that existing versions of variational inference in the literature can be interpreted as various forms of biased importance sampling of the marginal likelihood. Combining this insight with ideas from perturbation theory in statistical physics will lead us to a tighter bound on the true marginal likelihood and to better predictive performance of Variational Autoencoders. Bio: Robert Bamler is a Postdoc at UCI in the group of Prof. Stephan Mandt. His interests are probabilistic embedding models, variational inference, and probabilistic deep learning methods for data compression. Before joining UCI in December of 2018, Rob worked in the statistical machine learning group at Disney Research in Pittsburgh and Los Angeles. He received his PhD in theoretical statistical and quantum physics from University of Cologne, Germany.
Oct 28 4011 Bren Hall 1 pm	Zhou Yu Assistant Professor Dept. of Computer Science University of California, Davis Augment intelligence with multimodal information Humans interact with other humans or the world through information from various channels including vision, audio, language, haptics, etc. To simulate intelligence, machines require similar abilities to process and combine information from different channels to acquire better situation awareness, better communication ability, and better decision-making ability. In this talk, we describe three projects. In the first study, we enable a robot to utilize both vision and audio information to achieve better user understanding. Then we use incremental language generation to improve the robot’s communication with a human. In the second study, we utilize multimodal history tracking to optimize policy planning in task-oriented visual dialogs. In the third project, we tackle the well-known trade-off between dialog response relevance and policy effectiveness in visual dialog generation. We propose a new machine learning procedure that alternates between supervised learning and reinforcement learning to optimize language generation and policy planning jointly in visual dialogs. We will also cover some recent ongoing work on image synthesis through dialogs, and generating social multimodal dialogs with a blend of GIFs and words. Bio: Zhou Yu is an Assistant Professor at the Computer Science Department at UC Davis. She received her PhD from Carnegie Mellon University in 2017. Zhou is interested in building robust and multi-purpose dialog systems using fewer data points and less annotation. She also works on language generation, vision and language tasks. Zhou’s work on persuasive dialog systems received an ACL 2019 best paper nomination recently. Zhou was featured in Forbes as 2018 30 under 30 in Science for her work on multimodal dialog systems.
Her team recently won the 2018 Amazon Alexa Prize on building an engaging social bot for a $500,000 cash award.
Nov 4	Geng Ji PhD Student Dept of Computer Science University of California, Irvine Variational Inference: To Derive or Not To Derive Variational inference provides a general optimization framework to approximate the posterior distributions of latent variables in probabilistic models. Although effective in simple scenarios, it may be inaccurate or infeasible when the data is high-dimensional, the model structure is complicated, or variable relationships are non-conjugate. In this talk, I will present two different strategies to solve these problems. The first one is to derive rigorous variational bounds by leveraging the probabilistic relations and structural dependencies of the given model. One example I will explore is large-scale noisy-OR Bayesian networks popular in IT companies for analyzing the semantic content of massive text datasets. The second strategy is to create flexible algorithms directly applicable to many models, as can be expressed by probabilistic programming systems. I’ll talk about a low-variance Monte Carlo variational inference framework we recently developed for arbitrary models with discrete variables. It has appealing advantages over REINFORCE-style stochastic gradient estimates and model-dependent auxiliary-variable solutions, as demonstrated on real-world models of images, text, and social networks. Bio: Geng Ji is a PhD candidate in the CS Department of UC Irvine, advised by Professor Erik Sudderth. His research interests are broadly in probabilistic graphical models, large-scale variational inference, as well as their applications in computer vision and natural language processing. He did summer internships at Disney Research in 2017 mentored by Professor Stephan Mandt, and Facebook AI in 2018 which he will join as a full-time research scientist.
Nov 11	Veterans Day
Nov 18 4011 Bren Hall 1 pm	John T. Halloran Postdoctoral Researcher Dept. of Biomedical Engineering University of California, Davis Accelerated Machine Learning for Computational Proteomics In the past few decades, mass spectrometry-based proteomics has dramatically improved our fundamental knowledge of biology, leading to advancements in the understanding of diseases and methods for clinical diagnoses. However, the complexity and sheer volume of typical proteomics datasets make both fast and accurate analysis difficult to accomplish simultaneously; while machine learning methods have proven themselves capable of incredibly accurate proteomic analysis, such methods deter use by requiring extremely long runtimes in practice. In this talk, we will discuss two core problems in computational proteomics and how to accelerate the training of their highly accurate, but slow, machine learning solutions. For the first problem, wherein we seek to infer the protein subsequences (called peptides) present in a biological sample, we will improve the training of graphical models by deriving emission functions which render conditional-maximum likelihood learning concave. Used within a dynamic Bayesian network, we show that these emission functions not only allow extremely efficient learning of globally-convergent parameters, but also drastically outperform the state-of-the-art in peptide identification accuracy. For the second problem, wherein we seek to further improve peptide identification accuracy by classifying correct versus incorrect identifications, we will speed up the state-of-the-art in discriminative learning using a combination of improved convex optimization and extensive parallelization. We show that on massive datasets containing hundreds-of-millions of peptide identifications, these speedups reduce discriminative analysis time from several days down to just several hours, without any degradation in analysis quality. 
Bio: John Halloran is a Postdoc at UC Davis working with Professor David Rocke. He received his PhD from the University of Washington in 2016. John is interested in developing fast and accurate machine learning solutions for massive-scale problems encountered in computational biology. His work regularly focuses on efficient generative and discriminative training of dynamic graphical models. He is a recipient of the UC Davis Award for Excellence in Postdoctoral Research and a UW Genome Training Grant.
Nov 25 4011 Bren Hall 1 pm	Xanda Schofield Assistant Professor Dept. of Computer Science Harvey Mudd College Towards Practical and Locally Private Topic Models A critical challenge in the large-scale analysis of people’s data is protecting the privacy of the people who generated it. Of particular interest is how to privately infer models over discrete count data, like frequencies of words in a message or the number of times two people have interacted. Recently, I helped to develop locally private Bayesian Poisson factorization, a method for differentially private inference for a large family of models of count data, including topic models, stochastic block models, event models, and beyond. However, in the domain of topic models over text, this method can encounter serious obstacles in both speed and model quality. These arise from the collision of high-dimensional, sparse counts of text features in a bag-of-words representation, and dense noise from a privacy mechanism. In this talk, I address several challenges in the space of private statistical model inference over language data, as well as corresponding approaches to produce interpretable models. Bio: Xanda Schofield is an Assistant Professor in Computer Science at Harvey Mudd College. Her work focuses on practical applications of unsupervised models of text, particularly topic models, to research in the humanities and social sciences. More recently, her work has expanded to the intersection of privacy and text mining. She completed her Ph.D. in 2019 at Cornell University advised by David Mimno. In her graduate career, she was the recipient of an NDSEG Fellowship, the Anita Borg Memorial Scholarship, and the Microsoft Graduate Women’s Scholarship. She is also an avid cookie baker and tweets @XandaSchofield.
Dec 2 4011 Bren Hall 1 pm	Shayan Doroudi Assistant Professor School of Education University of California, Irvine Bias, Variance, and the Intertwined Histories of Artificial Intelligence and Education Research This talk will be divided into two parts. In the first part, I will demonstrate that the bias-variance tradeoff in machine learning and statistics can be generalized to offer insights to debates in other scientific fields. In particular, I will show how it can be applied to situate a variety of debates that appear in the education literature. In the second part of my talk, I will give a brief account of how the early history of artificial intelligence was naturally intertwined with the history of education research and the learning sciences. I will use the generalized bias-variance tradeoff as a lens with which to situate different trends that appeared in this history. Today, AI researchers might see education as just another application area, but historically AI and education were integrated into a broader movement to understand and improve intelligence and learning, in humans and in machines. Bio: Shayan Doroudi is an assistant professor at the UC Irvine School of Education. His research is focused on the learning sciences, educational technology, and the educational data sciences. He is particularly interested in studying the prospects and limitations of data-driven algorithms in learning technologies, including lessons that can be drawn from the rich history of educational technology. He earned his B.S. in Computer Science from the California Institute of Technology, and his M.S. and Ph.D. in Computer Science from Carnegie Mellon.
Dec 9	Finals week
Dec 16 4011 Bren Hall 1 pm	Eric Nalisnick Postdoctoral Researcher University of Cambridge/DeepMind Deep Learning Under Covariate Shift Deep neural networks have demonstrated impressive performance in predictive tasks. However, these models have been shown to be brittle, being easily fooled by even small perturbations of the input features (covariates). In this talk, I describe two approaches for handling covariate shift. The first uses a Bayesian prior derived from data augmentation to make the classifier robust to potential test-time shifts. The second strategy is to directly model the covariates using a ‘hybrid model’: a model of the joint distribution over labels and features. In experiments involving this latter approach, we discovered limitations in some existing methods for detecting distributional shift in high dimensions. I demonstrate that a simple entropy-based goodness-of-fit test can solve some of these issues but conclude by arguing that more investigation is needed. Bio: Eric Nalisnick is a postdoctoral researcher at the University of Cambridge and a part-time research scientist at DeepMind. His research interests span statistical machine learning, with a current emphasis on Bayesian deep learning, generative modeling, and out-of-distribution detection. He received his PhD from the University of California, Irvine, where he was supervised by Padhraic Smyth. Eric has also spent time interning at DeepMind, Twitter, Microsoft, and Amazon.

Spring 2019

Apr 8	No Seminar
Apr 15 Bren Hall 4011 1 pm	Daeyun Shin PhD Candidate Dept of Computer Science UC Irvine Multi-layer Depth and Epipolar Feature Transformers for 3D Scene Reconstruction In this presentation, I will describe our approach to the problem of automatically reconstructing a complete 3D model of a scene from a single RGB image. This challenging task requires inferring the shape of both visible and occluded surfaces. Our approach utilizes a viewer-centered, multi-layer representation of scene geometry adapted from recent methods for single object shape completion. To improve the accuracy of viewer-centered representations for complex scenes, we introduce a novel “Epipolar Feature Transformer” that transfers convolutional network features from an input view to other virtual camera viewpoints, and thus better covers the 3D scene geometry. Unlike existing approaches that first detect and localize objects in 3D, and then infer object shape using category-specific models, our approach is fully convolutional, end-to-end differentiable, and avoids the resolution and memory limitations of voxel representations. We demonstrate the advantages of multi-layer depth representations and epipolar feature transformers on the reconstruction of a large database of indoor scenes. Project page: https://www.ics.uci.edu/~daeyuns/layered-epipolar-cnn/
Apr 22 Bren Hall 4011 1 pm	Mike Pritchard Assistant Professor Dept. of Earth System Sciences University of California, Irvine Improving global climate simulations using physically constrained deep learning emulators of unresolved moist turbulence processes I will discuss machine-learning emulation of O(100M) cloud-resolving simulations of moist turbulence for use in multi-scale global climate simulation. First, I will present encouraging results from pilot tests on an idealized ocean-world, in which a fully connected deep neural network (DNN) is found to be capable of emulating explicit subgrid vertical heat and vapor transports across a globally diverse population of convective regimes. Next, I will demonstrate that O(10k) instances of the DNN emulator spanning the world are able to feed back realistically with a prognostic global host atmospheric model, producing viable ML-powered climate simulations that exhibit realistic space-time variability for convectively coupled weather dynamics and even some limited out-of-sample generalizability to new climate states beyond the training data’s boundaries. I will then discuss a new prototype of the neural network under development that includes the ability to enforce multiple physical constraints within the DNN optimization process, which exhibits potential for further generalizability. Finally, I will conclude with some discussion of the unsolved technical issues and interesting philosophical tensions being raised in the climate modeling community by this disruptive but promising approach for next-generation global simulation.
Apr 29 Bren Hall 4011 1 pm	Nick Gallo PhD Candidate Department of Computer Science University of California, Irvine Coarse to Fine Lifted Inference Large problems with repetitive sub-structure arise in many domains such as social network analysis, collective classification, and database entity resolution. In these instances, individual data is augmented with a small set of rules that uniformly govern the relationship among groups of objects (for example: “the friend of my friend is probably my friend” in a social network). Uncertainty is captured by a probabilistic graphical model structure. While theoretically sound, standard reasoning techniques cannot be applied due to the massive size of the network (often millions of random variables and trillions of factors). Previous work on lifted inference efficiently exploits symmetric structure in graphical models, but breaks down in the presence of unique individual data (contained in all real-world problems). Current methods to address this problem are largely heuristic. In this presentation we describe a coarse-to-fine approximate inference framework that initially treats all individuals identically, gradually relaxing this restriction to finer sub-groups. This produces a sequence of inference objective bounds of monotonically increasing cost and accuracy. We then discuss our work on incorporating high-order inference terms (over large subsets of variables) into lifted inference and ongoing challenges in this area.
May 13 Bren Hall 4011 1 pm	Matt Gardner Senior Research Scientist Allen Institute for Artificial Intelligence Reasoning Our Way to Reading Reading machines that truly understood what they read would change the world, but our current best reading systems struggle to understand text at anything more than a superficial level. In this talk I try to reason out what it means to “read”, and how reasoning systems might help us get there. I will introduce three reading comprehension datasets that require systems to reason at a deeper level about the text that they read, using numerical, coreferential, and implicative reasoning abilities. I will also describe some early work on models that can perform these kinds of reasoning. Bio: Matt is a senior research scientist at the Allen Institute for Artificial Intelligence (AI2) on the AllenNLP team, and a visiting scholar at UCI. His research focuses primarily on getting computers to read and answer questions, dealing both with open domain reading comprehension and with understanding question semantics in terms of some formal grounding (semantic parsing). He is particularly interested in cases where these two problems intersect, doing some kind of reasoning over open domain text. He is the original author of the AllenNLP toolkit for NLP research, and he co-hosts the NLP Highlights podcast with Waleed Ammar.
May 27	No Seminar (Memorial Day)
June 3 Bren Hall 4011 12:00	Peter Sadowski Assistant Professor Information and Computer Sciences University of Hawaii Manoa Deep Learning for Extreme Remote Sensing: from Ocean Waves to Exocomets New technologies for remote sensing and astronomy provide an unprecedented view of Earth, our Sun, and beyond. Traditional data-analysis pipelines in oceanography, atmospheric sciences, and astronomy struggle to take full advantage of the massive amounts of high-dimensional data now available. I will describe opportunities for using deep learning to process satellite and telescope data, and discuss recent work mapping extreme sea states using Synthetic Aperture Radar (SAR), inferring the physics of our Sun’s atmosphere, and detecting anomalous astrophysical events in other systems, such as comets transiting distant stars. Bio: Peter Sadowski is an Assistant Professor of Information and Computer Sciences at the University of Hawaii Manoa and Co-Director of the AI Precision Health Institute at the University of Hawaii Cancer Center. He completed his Ph.D. and Postdoc at University of California Irvine, and his undergraduate studies at Caltech. His research focuses on deep learning and its applications to the natural sciences, particularly those at the intersection of machine learning and physics.
June 3 Bren Hall 4011 1 pm	Max Welling Research Chair, University of Amsterdam VP Technologies, Qualcomm Integrating Generative Modeling into Deep Learning Deep learning has boosted the performance of many applications tremendously, such as object classification and detection in images, speech recognition and understanding, machine translation, and game play such as chess and Go. However, these all constitute reasonably narrow and well-defined tasks for which it is reasonable to collect very large datasets. For artificial general intelligence (AGI) we will need to learn from a small number of samples, generalize to entirely new domains, and reason about a problem. What do we need in order to make progress to AGI? I will argue that we need to combine the data generating process, such as the physics of the domain and the causal relationships between objects, with the tools of deep learning. In this talk I will present a first attempt to integrate the theory of graphical models, which arguably was the dominant machine learning modeling paradigm around the turn of the twenty-first century, with deep learning. Graphical models express the relations between random variables in an interpretable way, while probabilistic inference in such networks can be used to reason about these variables. We will propose a new hybrid paradigm where probabilistic message passing in such networks is enhanced with graph convolutional neural networks to improve the ability of such systems to reason and make predictions.
June 10	No Seminar (Finals)

Faculty Positions at UC Irvine

Application deadline: Jan 15th, 2019 (Applications received by January 1, 2019 will receive fullest consideration.)

Apply online at: https://recruit.ap.uci.edu/apply/JPF04950

The Department of Computer Science in the Donald Bren School of Information and Computer Sciences (ICS) at the University of California, Irvine (UCI) invites applications for multiple tenure-track assistant professor or tenured associate/full professor positions beginning July 1, 2019. The Department is interested in individuals with research interests in all aspects of algorithms, artificial intelligence, machine learning, and theory of computing. One opening is targeted at individuals whose computer science expertise aligns with the growing UCI Data Science Initiative.

Fall 2018

Oct 1	No Seminar
Oct 8 Bren Hall 4011 1 pm	Matt Gardner Research Scientist Allen Institute for AI A Tale of Two Question Answering Systems The path to natural language understanding goes through increasingly challenging question answering tasks. I will present research that significantly improves performance on two such tasks: answering complex questions over tables, and open-domain factoid question answering. For answering complex questions, I will present a type-constrained encoder-decoder neural semantic parser that learns to map natural language questions to programs. For open-domain factoid QA, I will show that training paragraph-level QA systems to give calibrated confidence scores across paragraphs is crucial when the correct answer-containing paragraph is unknown. I will conclude with some thoughts about how to combine these two disparate QA paradigms, towards the goal of answering complex questions over open-domain text. Bio: Matt Gardner is a research scientist at the Allen Institute for Artificial Intelligence (AI2), where he has been exploring various kinds of question answering systems. He is the lead designer and maintainer of the AllenNLP toolkit, a platform for doing NLP research on top of PyTorch. Matt is also the co-host of the NLP Highlights podcast, where, with Waleed Ammar, he gets to interview the authors of interesting NLP papers about their work. Prior to joining AI2, Matt earned a PhD from Carnegie Mellon University, working with Tom Mitchell on the Never Ending Language Learning project.
Oct 22 Bren Hall 4011 1 pm	Stephan Mandt Assistant Professor Dept. of Computer Science UC Irvine Deep Probabilistic Modeling I will give an overview of some exciting recent developments in deep probabilistic modeling, which combines deep neural networks with probabilistic models for unsupervised learning. Deep probabilistic models are capable of synthesizing artificial data that highly resemble the training data, and are able to fool both machine learning classifiers and humans. These models have numerous applications in creative tasks, such as voice, image, or video synthesis and manipulation. At the same time, combining neural networks with strong priors results in flexible yet highly interpretable models for finding hidden structure in large data sets. I will summarize my group’s activities in this space, including measuring semantic shifts of individual words over hundreds of years, summarizing audience reactions to movies, and predicting the future evolution of video sequences with applications to neural video coding.
Oct 25 Bren Hall 3011 3 pm	(Note: different day (Thurs), time (3pm), and location (3011) relative to usual Monday seminars) Steven Wright Professor Department of Computer Sciences University of Wisconsin, Madison Optimization in Data Science Many of the computational problems that arise in data analysis and machine learning can be expressed mathematically as optimization problems. Indeed, much new algorithmic research in optimization is being driven by the need to solve large, complex problems from these areas. In this talk, we review a number of canonical problems in data analysis and their formulations as optimization problems. We will cover support vector machines / kernel learning, logistic regression (including regularized and multiclass variants), matrix completion, deep learning, and several other paradigms.
Oct 29 Bren Hall 4011 1 pm	Alex Psomas Postdoctoral Researcher Computer Science Department Carnegie Mellon University Fair Resource Allocation: From Theory to Practice We study the problem of fairly allocating a set of indivisible items among $n$ agents. Typically, the literature has focused on one-shot algorithms. In this talk we depart from this paradigm and allow items to arrive online. When an item arrives we must immediately and irrevocably allocate it to an agent. A paradigmatic example is that of food banks: food donations arrive, and must be delivered to nonprofit organizations such as food pantries and soup kitchens. Items are often perishable, which is why allocation decisions must be made quickly, and donated items are typically leftovers, leading to lack of information about items that will arrive in the future. Which recipient should a new donation go to? We approach this problem from different angles. In the first part of the talk, we study the problem of minimizing the maximum envy between any two recipients, after all the goods have been allocated. We give a polynomial-time, deterministic and asymptotically optimal algorithm with vanishing envy, i.e. the maximum envy divided by the number of items T goes to zero as T goes to infinity. In the second part of the talk, we adopt and further develop an emerging paradigm called virtual democracy. We will take these ideas all the way to practice. In the last part of the talk I will present some results from an ongoing work on automating the decisions faced by a food bank called 412 Food Rescue, an organization in Pittsburgh that matches food donations with non-profit organizations.
Nov 5 Bren Hall 4011 1 pm	Fred Park Associate Professor Dept of Math & Computer Science Whittier College Image Segmentation and Tracking Utilizing a Difference of Convex Regularized Mumford-Shah Functional In this talk I will give a brief overview of the segmentation and tracking problems and will propose a new model that tackles both of them. This model incorporates a weighted difference of anisotropic and isotropic total variation (TV) norms into a relaxed formulation of the Mumford-Shah (MS) model. We will show results exceeding those obtained by the MS model when using the standard TV norm to regularize partition boundaries. Examples illustrating the qualitative differences between the proposed model and the standard MS one will be shown as well. I will also talk about a fast numerical method that is used to optimize the proposed model utilizing the difference-of-convex algorithm (DCA) and the primal dual hybrid gradient (PDHG) method. Finally, future directions will be given that could harness the power of convolution nets for more advanced segmentation tasks.
Nov 12	No Seminar (Veterans Day)
Nov 19 Bren Hall 4011 1 pm	Philip Nelson Director of Engineering Google Research Accelerating bio discovery with machine learning, the promise and the peril Google Accelerated Sciences is a translational research team that brings Google’s technological expertise to the scientific community. Recent advances in machine learning have delivered incredible results in consumer applications (e.g. photo recognition, language translation), and machine learning is now beginning to play an important role in life sciences. Taking examples from active collaborations in the biochemical, biological, and biomedical fields, I will focus on how our team transforms science problems into data problems and applies Google’s scaled computation, data-driven engineering, and machine learning to accelerate discovery. See http://g.co/research/gas for our publications and more details. Bio: Philip Nelson is a Director of Engineering in Google Research. He joined Google in 2008 and was previously responsible for a range of Google applications and geo services. In 2013, he helped found and currently leads the Google Accelerated Science team that collaborates with academic and commercial scientists to apply Google’s knowledge, experience, and technologies to important scientific problems. Philip graduated from MIT in 1985 where he did award-winning research on hip prosthetics at Harvard Medical School. Before Google, Philip helped found and lead several Silicon Valley startups in search (Verity), optimization (Impresse), and genome sequencing (Complete Genomics) and was also an Entrepreneur in Residence at Accel Partners.
Nov 26 Bren Hall 4011 1 pm	Richard Futrell Assistant Professor Dept of Language Science UC Irvine Natural language as a code: Modeling human language using information theory Why is natural language the way it is? I propose that human languages can be modeled as solutions to the problem of efficient communication among intelligent agents with certain information processing constraints, in particular constraints on short-term memory. I present an analysis of dependency treebank corpora of over 50 languages showing that word orders across languages are optimized to limit short-term memory demands in parsing. Next I develop a Bayesian, information-theoretic model of human language processing, and show that this model can intuitively explain an apparently paradoxical class of comprehension errors made by both humans and state-of-the-art recurrent neural networks (RNNs). Finally I combine these insights in a model of human languages as information-theoretic codes for latent tree structures, and show that optimization of these codes for expressivity and compressibility results in grammars that resemble human languages.
Dec 3	No Seminar (NIPS)

Two new NSF awards in Machine Learning for Sameer Singh

Congratulations to Professor Sameer Singh for receiving two multi-year research awards from the National Science Foundation (NSF). Under the first grant, Sameer and his research team will develop new algorithms to better explain why classifiers make certain decisions, increasing user trust in such models. The second grant focuses on the development of new approaches for extracting multimodal information from documents, such as text, images, numbers, and databases, with the goal of automatically creating new knowledge bases from relatively unstructured written documents.

Spring 2018

Apr 2	No Seminar
Apr 9 Bren Hall 4011 1 pm	Sabino Miranda, Ph.D CONACyT Researcher Center for Research and Innovation in Information and Communication Technologies Towards a Multilingual and Error-Robust Approach for Sentiment Analysis Sentiment Analysis is a research area concerned with the computational analysis of people’s feelings or beliefs expressed in texts such as emotions, opinions, attitudes, appraisals, etc. At the same time, with the growth of social media data (review websites, microblogging sites, etc.) on the Web, Twitter has received particular attention because it is a huge source of opinionated information with potential applications to decision-making tasks from business applications to the analysis of social and political events. In this context, I will present the multilingual and error-robust approaches developed in our group to tackle sentiment analysis as a classification problem, mainly for informal written text such as Twitter. Our approaches have been tested in several benchmark contests such as SemEval (International Workshop on Semantic Evaluation), TASS (Workshop for Sentiment Analysis Focused on Spanish), and PAN (Workshop on Digital Text Forensics).
Apr 16 Bren Hall 4011 1 pm	Roman Vershynin Professor of Mathematics University of California, Irvine Boolean functions and random tensors A simple way to generate a Boolean function in n variables is to take the sign of some polynomial. Such functions are called polynomial threshold functions. How many low-degree polynomial threshold functions are there? This problem was solved for degree d=1 by Zuev in 1989 and has remained open for any higher degrees, including d=2, since then. In a joint work with Pierre Baldi (UCI), we settled the problem for all degrees d>1. The solution explores connections of Boolean functions to additive combinatorics and high-dimensional probability. This leads to a program of extending random matrix theory to random tensors, which is mostly an uncharted territory at present.
Apr 23 Bren Hall 4011 1 pm	Zhile Ren PhD Candidate, Computer Science Brown University Semantic Three-Dimensional Understanding of Dynamic Scenes We develop new representations and algorithms for three-dimensional (3D) scene understanding from images and videos. In cluttered indoor scenes, RGB-D images are typically described by local geometric features of the 3D point cloud. We introduce descriptors that account for 3D camera viewpoint, and use structured learning to perform 3D object detection and room layout prediction. We also extend this work by using latent support surfaces to capture style variations of 3D objects and help detect small objects. Contextual relationships among categories and layout are captured via a cascade of classifiers, leading to holistic scene hypotheses with improved accuracy. In outdoor autonomous driving applications, given two consecutive frames from a pair of stereo cameras, 3D scene flow methods simultaneously estimate the 3D geometry and motion of the observed scene. We incorporate semantic segmentation in a cascaded prediction framework to more accurately model moving objects by iteratively refining segmentation masks, stereo correspondences, 3D rigid motion estimates, and optical flow fields.
Apr 30	Cancelled
May 7 Bren Hall 4011 1 pm	Vivek Srikumar Assistant Professor University of Utah Natural Language Processing in the Wild: Opportunities & Challenges Natural language processing (NLP) sees potential applicability in a broad array of user-facing applications. To realize this potential, however, we need to address several challenges related to representations, data availability and scalability. In this talk, I will discuss these concerns and how we may overcome them. First, as a motivating example of NLP’s broad reach, I will present our recent work on using language technology to improve mental health treatment. Then, I will focus on some of the challenges that need to be addressed. The choice of representations can make a big difference in our ability to reason about text; I will discuss recent work on developing rich semantic representations. Finally, I will touch upon the problem of systematically speeding up the entire NLP pipeline without sacrificing accuracy. As a concrete example, I will present a new algebraic characterization of the process of feature extraction, as a direct consequence of which, we can make trained classifiers significantly faster.
May 14 Bren Hall 4011 1 pm	Shu Kong PhD Candidate, Computer Science University of California, Irvine Pay Attention to the Pixel, Understand the Scene Better Objects may appear at arbitrary scales in perspective images of a scene, posing a challenge for recognition systems that process images at a fixed resolution. We propose a depth-aware gating module that adaptively selects the pooling field size (by fusing multi-scale pooled features) in a convolutional network architecture according to the object scale (inversely proportional to the depth) so that small details are preserved for distant objects while larger receptive fields are used for those nearby. The depth gating signal is provided by stereo disparity or estimated directly from monocular input. We further integrate this depth-aware gating into a recurrent convolutional neural network to refine semantic segmentation, and show state-of-the-art performance on several benchmarks. Moreover, rather than fusing multi-scale pooled features based on estimated depth, we show the “correct” size of pooling field for each pixel can be decided in an attentional fashion by our Pixel-wise Attentional Gating unit (PAG), which learns to choose the pooling size for each pixel. PAG is a generic, architecture-independent, problem-agnostic mechanism that can be readily “plugged in” to an existing model with fine-tuning. We utilize PAG in two ways: 1) learning spatially varying pooling fields that improve model performance without extra computation cost, and 2) learning a dynamic computation policy for each pixel to decrease total computation while maintaining accuracy. We extensively evaluate PAG on a variety of per-pixel labeling tasks, including semantic segmentation, boundary detection, monocular depth and surface normal estimation. We demonstrate that PAG allows competitive or state-of-the-art performance on these tasks. We also show that PAG learns dynamic spatial allocation of computation over the input image which provides better performance trade-offs compared to related approaches (e.g., truncating deep models or dynamically skipping whole layers). Generally, we observe that PAG reduces computation by 10% without noticeable loss in accuracy, and performance degrades gracefully when imposing stronger computational constraints.
May 21 Bren Hall 4011 1 pm	Rich Caruana Principal Researcher Microsoft Research Friends Don’t Let Friends Deploy Black-Box Models: The Importance of Intelligibility in Machine Learning In machine learning often a tradeoff must be made between accuracy and intelligibility: the most accurate models usually are not very intelligible (e.g., deep nets, boosted trees and random forests), and the most intelligible models usually are less accurate (e.g., logistic regression and decision lists). This tradeoff often limits the accuracy of models that can be safely deployed in mission-critical applications such as healthcare where being able to understand, validate, edit, and ultimately trust a learned model is important. We have been working on a learning method based on generalized additive models (GAMs) that is often as accurate as full complexity models, but even more intelligible than linear models. This makes it easy to understand what a model has learned, and also makes it easier to edit the model when it learns inappropriate things because of unanticipated problems with the data. Making it possible for experts to understand a model and repair it is critical because most data has unanticipated landmines. In the talk I’ll present two healthcare case studies where these high-accuracy GAMs discover surprising patterns in the data that would have made deploying a black-box model risky. I’ll also briefly show how we’re using these models to detect bias in domains where fairness and transparency are paramount.
May 28	Memorial Day
Jun 4 Bren Hall 4011 1 pm	Stephen McAleer (Pierre Baldi’s group) Graduate Student, Computer Science University of California, Irvine Learning how to solve the Rubik’s Cube with no Human Knowledge We will present a novel approach to solving the Rubik’s cube effectively without any human knowledge using several ingredients including deep learning, reinforcement learning, and Monte Carlo searches. At the end, if time permits, we will describe several extensions to the neuronal Boolean complexity results presented by Roman Vershynin a few weeks ago.
Jun 11	No Seminar (finals week)

Workshop for the Philosophy of Machine Learning

UC Irvine held a very successful workshop on the “Philosophy of Machine Learning” on March 17th & 18th, in the Donald Bren Hall Conference Center (DBH 6011). More information may be found at: https://philmachinelearning.wordpress.com/program/.

Organizers: Andrew Holbrook (Statistics) and Kino Zhao (Logic and Philosophy of Science)

Sponsors: UCI School of Social Sciences; UCI Dept of Logic & Philosophy of Science; UCI Data Science Initiative; and Dr. Babak Shahbaba (via NSF).

Winter 2018

Jan 15	No Seminar (MLK Day)
Jan 22 Bren Hall 4011 1 pm	Shufeng Kong PhD Candidate Centre for Quantum Software and Information, FEIT University of Technology Sydney, Australia Multiagent Simple Temporal Problem: The Arc-Consistency Approach The Simple Temporal Problem (STP) is a fundamental temporal reasoning problem and has recently been extended to the Multiagent Simple Temporal Problem (MaSTP). In this paper we present a novel approach that is based on enforcing arc-consistency (AC) on the input (multiagent) simple temporal network. We show that the AC-based approach is sufficient for solving both the STP and MaSTP and provide efficient algorithms for them. As our AC-based approach does not impose new constraints between agents, it does not violate the privacy of the agents and is superior to the state-of-the-art approach to MaSTP. Empirical evaluations on diverse benchmark datasets also show that our AC-based algorithms for STP and MaSTP are significantly more efficient than existing approaches.
Jan 29 Bren Hall 4011 1 pm	Yangfeng Ji Postdoctoral Scholar Paul Allen School of Computer Science and Engineering University of Washington Bringing Structural Information into Neural Network Design Deep learning is one of the most important techniques used in natural language processing (NLP). A central question in deep learning for NLP is how to design a neural network that can fully utilize the information from training data and make accurate predictions. A key to solving this problem is to design a better network architecture. In this talk, I will present two examples from my work on how structural information from natural language helps design better neural network models. The first example shows adding coreference structures of entities not only helps different aspects of text modeling, but also improves the performance of language generation; the second example demonstrates structures of organizing sentences into coherent texts can help neural networks build better representations for various text classification tasks. Along the lines of this topic, I will also propose a few ideas for future work and discuss some potential challenges.
February 5	No Seminar (AAAI)
February 12 Bren Hall 4011 1 pm	Eric Nalisnick PhD Candidate Computer Science University of California, Irvine Averaging and Combining Variational Models with Stein Particle Descent Bayesian inference for complex models—the kinds needed to solve complex tasks such as object recognition—is inherently intractable, requiring analytically difficult integrals be solved in high dimensions. One solution is to turn to variational Bayesian inference (VI): a parametrized family of distributions is proposed, and optimization is carried out to find the member of the family nearest to the true posterior. There is an innate trade-off within VI between expressive vs tractable approximations. We wish the variational family to be as rich as possible so that it might include the true posterior (or something very close), but adding structure to the approximation increases the computational complexity of optimization. As a result, there has been much interest in efficient optimization strategies for mixture model approximations. In this talk, I’ll return to the problem of using mixture models for VI. First, to motivate our approach, I’ll discuss the distinction between averaging vs combining variational models. We show that optimization objectives aimed at fitting mixtures (i.e. model combination), in practice, are relaxed into performing something between model combination and averaging. Our primary contribution is to formulate a novel training algorithm for variational model averaging by adapting Stein variational gradient descent to operate on the parameters of the approximating distribution. Then, through a particular choice of kernel, we show the algorithm can be adapted to perform something closer to model combination, providing a new algorithm for optimizing (finite) mixture approximations.
February 19	No Seminar (President’s Day)
February 26 Bren Hall 4011 1 pm	Jay Pujara Research Scientist ISI/USC What do Probabilistic Models Know? Knowledge is an essential ingredient in the quest for artificial intelligence, yet scalable and robust approaches to acquiring knowledge have challenged AI researchers for decades. Often, the obstacle to knowledge acquisition is massive, uncertain, and changing data that obscures the underlying knowledge. In such settings, probabilistic models have excelled at exploiting the structure in the domain to overcome ambiguity, revise beliefs and produce interpretable results. In my talk, I will describe recent work using probabilistic models for knowledge graph construction and information extraction, including linking subjects across electronic health records, fusing background knowledge from scientific articles with gene association studies, disambiguating user browsing behavior across platforms and devices, and aligning structured data sources with textual summaries. I also highlight several areas of ongoing research, fusing embedding approaches with probabilistic modeling and building models that support dynamic data or human-in-the-loop interactions. Bio: Jay Pujara is a research scientist at the University of Southern California’s Information Sciences Institute whose principal areas of research are machine learning, artificial intelligence, and data science. He completed a postdoc at UC Santa Cruz, earned his PhD at the University of Maryland, College Park and received his MS and BS at Carnegie Mellon University. Prior to his PhD, Jay spent six years at Yahoo! working on mail spam detection, user trust, and contextual mail experiences, and he has also worked at Google, LinkedIn and Oracle. Jay is the author of over thirty peer-reviewed publications and has received three best paper awards for his work. 
He is a recognized authority on knowledge graphs, and has organized the Automatic Knowledge Base Construction (AKBC) and Statistical Relational AI (StaRAI) workshops, has presented tutorials on knowledge graph construction at AAAI and WSDM, and has had his work featured in AI Magazine.
March 5 Bren Hall 4011 1 pm	Vagelis Papalexakis Assistant Professor UC Riverside Tensor Decompositions for Big Multi-aspect Data Analytics Tensors and tensor decompositions have been very popular and effective tools for analyzing multi-aspect data in a wide variety of fields, ranging from Psychology to Chemometrics, and from Signal Processing to Data Mining and Machine Learning. Using tensors in the era of big data presents us with a rich variety of applications, but also poses great challenges such as the one of scalability and efficiency. In this talk I will first motivate the effectiveness of tensor decompositions as data analytic tools in a variety of exciting, real-world applications. Subsequently, I will discuss recent techniques on tackling the scalability and efficiency challenges by parallelizing and speeding up tensor decompositions, especially for very sparse datasets, including the scenario where the data are continuously updated over time. Finally, I will discuss open problems in unsupervised tensor mining and quality assessment of the results, and present work-in-progress addressing that problem with very encouraging results.
March 12 Bren Hall 4011 1 pm	Alessandro Achille PhD Student UC Los Angeles The Emergence Theory of Deep Learning: Perception, Information Theory and PAC-Bayes I will describe the basic elements of the Emergence Theory of Deep Learning, which started as a general theory for representations, and is comprised of three parts: (1) We formalize the desirable properties that a representation should possess, based on classical principles of statistical decision and information theory: invariance, sufficiency, minimality, disentanglement. We then show that such an optimal representation of the data can be learned by minimizing a specific loss function which is related to the notion of Information Bottleneck and Variational Inference. (2) We analyze common empirical losses employed in Deep Learning (such as empirical cross-entropy), and implicit or explicit regularizers, including Dropout and Pooling, and show that they bias the network toward recovering such an optimal representation. Finally, (3) we show that minimizing a suitably (implicitly or explicitly) regularized loss with SGD with respect to the weights of the network implies implicit optimization of the loss described in (1), which relates instead to the activations of the network. Therefore, even when we optimize a DNN as a black-box classifier, we are always biased toward learning minimal, sufficient and invariant representations. The link between (implicit or explicit) regularization of the classification loss and learning of optimal representations is specific to the architecture of deep networks, and is not found in a general classifier. The theory is related to a new version of the Information Bottleneck that studies the weights of a network, rather than the activations, and can also be derived using PAC-Bayes or Kolmogorov complexity arguments, providing independent validation.
March 19	No Seminar (Finals Week)

PhD Students win Best Poster Awards

Congratulations to CML graduate students for recent poster awards at the 2017 Southern California Machine Learning Symposium held at USC. Zhengli Zhao and Dheeru Dua (with advisor Sameer Singh) won the best poster award for their work on generating natural adversarial examples, and Eric Nalisnick (with advisor Padhraic Smyth) won an honorable mention for his work on boosting variational inference. About 50 student posters were presented, and over 250 machine learning researchers attended the event. The next SoCal ML Symposium is scheduled for Fall 2018, to be hosted by UCLA.

Center for Machine Learning and Intelligent Systems

Bren School of Information and Computer Science

University of California, Irvine
