Fall 2013

Sept 26 Bren Hall 4011 1 pm	Vittorio Ferrari Reader Department of Informatics University of Edinburgh Searching for objects driven by context The dominant visual search paradigm for object class detection is sliding windows. Although simple and effective, it is also wasteful, unnatural and rigidly hardwired. We propose strategies to search for objects which intelligently explore the space of windows by making sequential observations at locations decided based on previous observations. Our strategies adapt to the class being searched and to the content of a particular test image, exploiting context as the statistical relation between the appearance of a window and its location relative to the object, as observed in the training set. In addition to being more elegant than sliding windows, we demonstrate experimentally on the PASCAL VOC 2010 dataset that our strategies evaluate two orders of magnitude fewer windows while achieving higher object detection performance.
Oct 7 Bren Hall 4011 1 pm	Babak Shahbaba Assistant Professor Department of Statistics University of California, Irvine Towards Scalable Bayesian Inference Massive datasets have imposed new challenges for the scientific community. Data-intensive problems are especially challenging for Bayesian methods, which typically involve intractable models that rely on Markov Chain Monte Carlo (MCMC) algorithms for their implementation. In this talk, I will discuss our recent attempts to develop a new class of scalable computational methods to facilitate the application of Bayesian statistics in data-intensive scientific problems. One approach uses geometrically motivated methods that explore the parameter space more efficiently by exploiting its geometric properties. Another approach uses techniques that are designed to speed up sampling algorithms through faster exploration of the parameter space. I will also discuss a possible integration of geometric methods with proper computational techniques to improve the overall efficiency of sampling algorithms so that they can be used for Big Data analysis.
Oct 14 Bren Hall 4011 1 pm	James Foulds PhD Candidate Department of Computer Science University of California, Irvine Modeling Scientific Impact with Topical Influence Regression When reviewing scientific literature, it would be useful to have automatic tools that identify the most influential scientific articles as well as how ideas propagate between articles. In this context, this paper introduces topical influence, a quantitative measure of the extent to which an article tends to spread its topics to the articles that cite it. Given the text of the articles and their citation graph, we show how to learn a probabilistic model to recover both the degree of topical influence of each article and the influence relationships between articles. Experimental results on corpora from two well-known computer science conferences are used to illustrate and validate the proposed approach.
Oct 21 Bren Hall 4011 1 pm	Mohammad Azar Postdoctoral Fellow School of Computer Science Carnegie Mellon University A Spectral Approach to Sequential Transfer in Multi-armed Bandit with Finite Set of Models Learning from prior tasks and transferring that experience to improve future performance is critical for building lifelong learning agents. Although results in supervised and reinforcement learning show that transfer may significantly improve the learning performance, most of the literature on transfer is focused on batch learning tasks. In this paper we study the problem of sequential transfer in online learning, notably in the multi-armed bandit framework, where the objective is to minimize the cumulative regret over a sequence of tasks by incrementally transferring knowledge from prior tasks. We introduce a novel bandit algorithm based on a method-of-moments approach for the estimation of the possible tasks and derive regret bounds for it. Bio: Mohammad Gheshlaghi Azar studied Electrical Engineering (control theory) at the University of Tehran, Iran, from 2003 to 2006. He then moved to the Netherlands for his Ph.D., where he worked with Professor Bert Kappen and Professor Remi Munos on the subject of statistical machine learning and reinforcement learning. After finishing his Ph.D. in 2012, he joined the School of Computer Science at Carnegie Mellon University as a postdoctoral fellow, where he is working with Professor Brunskill on the subject of transfer of knowledge in sequential decision making problems. His research is focused on developing new machine learning algorithms which apply to life-long and real-world learning and decision making problems.
Oct 28 Bren Hall 4011 1 pm	Brian Milch Software Engineer Google (Los Angeles) From Text to Concepts at Google This talk will describe Rephil, a system used widely within Google to identify the concepts or topics that underlie a given piece of text. Rephil determines, for example, that “apple pie” relates to some of the same concepts as “chocolate cake”, but has little in common with “apple ipod”. The concepts used by Rephil are not pre-specified; instead, they are derived by an unsupervised learning algorithm running on massive amounts of text. The result of this learning process is a Rephil model — a giant Bayesian network with concepts as nodes. I will discuss the structure of Rephil models, the distributed machine learning algorithm that we use to build these models from terabytes of data, and the Bayesian network inference algorithm that we use to identify concepts in new texts under tight time constraints. I will also discuss how Rephil relates to ongoing academic research on probabilistic topic models. Bio: Brian Milch is a software engineer at Google’s Los Angeles office. He first joined Google in 2000, after completing a B.S. in Symbolic Systems at Stanford University. A year later, he entered the Computer Science Ph.D. program at U.C. Berkeley. He received his doctorate in 2006, with a thesis focused on the integration of probabilistic and logical approaches to artificial intelligence. He then spent two years as a post-doctoral researcher at MIT before returning to Google in 2008. He has contributed to Google production systems for spelling correction, transliteration, and semantic modeling of text.
Nov 4 Bren Hall 4011 1 pm	Yifei Chen PhD Candidate Department of Computer Science University of California, Irvine A gradient boosting algorithm for survival analysis via direct optimization of concordance index Survival analysis focuses on modeling and predicting the time to an event of interest. Traditional survival models (e.g., the prevalent proportional hazards model) often impose strong assumptions on hazard functions, which describe how the risk of an event changes over time depending on covariates associated with each individual. In this paper we propose a nonparametric survival model (GBMCI) that does not make explicit assumptions on hazard functions. Our model trains an ensemble of regression trees by the gradient boosting machine to optimize a smoothed approximation of the concordance index, which is one of the most widely used metrics in survival model evaluation. We benchmarked the performance of GBMCI against other popular survival models with a large-scale breast cancer prognosis dataset. Our experiment shows that GBMCI consistently outperforms other methods based on a number of covariate settings.
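For readers unfamiliar with the concordance index mentioned in the abstract above, the following is a minimal sketch of how the (unsmoothed) metric can be computed for right-censored data. It is not the GBMCI method itself: the function name `concordance_index` and its arguments are illustrative assumptions, and pairs with tied observed times are simply skipped for brevity.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Fraction of comparable pairs that the predicted risks order correctly.

    times  : observed time for each subject
    events : 1 if the event was observed, 0 if the subject was censored
    risks  : predicted risk score (higher means an earlier expected event)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed event.
        # Tied times are skipped here (a simplifying assumption).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, while 0.5 corresponds to random ranking. Because this pairwise count is a step function of the risk scores, a gradient-based learner needs a smoothed approximation, as the abstract notes.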
Nov 11	Veterans Day (no seminar)
Nov 18 Bren Hall 4011 1 pm	Shiwei Lan PhD Candidate Department of Statistics University of California, Irvine Spherical Hamiltonian Monte Carlo for Constrained Target Distributions Statistical models with constrained probability distributions are abundant in machine learning. Some examples include regression models with norm constraints (e.g., Lasso), probit models, many copula models, and Latent Dirichlet Allocation (LDA) models. Bayesian inference involving probability distributions confined to constrained domains can be quite challenging for commonly used sampling algorithms. For such problems, we propose a novel Markov Chain Monte Carlo (MCMC) method that provides a general and computationally efficient framework for handling boundary conditions. Our method first maps the $D$-dimensional constrained domain of parameters to the unit ball ${\bf B}_0^D(1)$, then augments it to the $D$-dimensional sphere ${\bf S}^D$ such that the original boundary corresponds to the equator of ${\bf S}^D$. This way, our method handles the constraints implicitly by moving freely on the sphere, generating proposals that remain within boundaries when mapped back to the original space. To improve the computational efficiency of our algorithm, we divide the dynamics into several parts such that the resulting split dynamics has a partial analytical solution as a geodesic flow on the sphere. We apply our method to several examples including truncated Gaussian, Bayesian Lasso, Bayesian bridge regression, and a copula model for identifying synchrony among multiple neurons. Our results show that the proposed method can provide a natural and efficient framework for handling several types of constraints on target distributions.
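The ball-to-sphere augmentation described in the abstract above can be illustrated with a small sketch. It assumes the auxiliary coordinate is $\pm\sqrt{1 - \|\theta\|^2}$, which is one natural choice consistent with the statement that the boundary of the ball maps to the equator. The function names are hypothetical, and the Hamiltonian dynamics themselves are not shown.

```python
import math


def ball_to_sphere(theta, sign=1.0):
    """Augment a point in the unit ball B^D(1) to a point on the sphere S^D.

    The extra coordinate is sign * sqrt(1 - ||theta||^2), so points on the
    boundary of the ball (||theta|| = 1) land on the equator (extra = 0).
    """
    sq_norm = sum(t * t for t in theta)
    if sq_norm > 1.0 + 1e-12:
        raise ValueError("theta must lie in the unit ball")
    extra = sign * math.sqrt(max(0.0, 1.0 - sq_norm))
    return list(theta) + [extra]


def sphere_to_ball(point):
    """Drop the auxiliary coordinate to map back into the unit ball."""
    return list(point[:-1])
```

Any point on the sphere maps back to a point with norm at most 1, which is why a sampler that moves freely on the sphere never leaves the constrained domain.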
Nov 25	Thanksgiving week (no seminar)
Dec 2 Bren Hall 4011 1 pm	Rosario Cammarota System Security Architect Qualcomm Research Rescheduled

Spring 2013

March 11 Bren Hall 4011 1 pm	Furong Huang Graduate Student Department of Electrical Engineering and Computer Science University of California, Irvine Learning Mixtures of Tree Graphical Models We consider unsupervised estimation of mixtures of discrete graphical models, where the class variable is hidden and each mixture component can have a potentially different Markov graph structure and parameters over the observed variables. We propose a novel method for estimating the mixture components with provable guarantees. Our output is a tree-mixture model which serves as a good approximation to the underlying graphical model mixture. The sample and computational requirements for our method scale as $\mathrm{poly}(p, r)$, for an $r$-component mixture of $p$-variate graphical models, for a wide class of models which includes tree mixtures and mixtures over bounded degree graphs.
March 18 Bren Hall 4011 1 pm	John Turner Assistant Professor Operations and Decision Technologies, The Paul Merage School of Business University of California, Irvine Planning and Scheduling of Guaranteed Delivery Advertising Over the past decade, improvements in information technology have led to the development of new media and new forms of advertising. One example is dynamic in-game advertising, in which ads served over the Internet are seamlessly integrated into the 3D environments of video games played on consoles like the XBox 360. We begin by introducing a plan-track-revise approach for an in-game ad scheduling problem posed by Massive Inc., a pioneer in dynamic in-game advertising that is now part of Microsoft. Using 26 weeks of historical data from Massive, we compare our algorithm’s ad slotting performance with Massive’s legacy algorithm over a rolling horizon, and find that we reduce make-good costs by 80-87%, reserve more premium ad slots for future sales, increase the number of unique individuals that see each ad campaign, and deliver ads in a smoother, more consistent fashion over time. Next, we build on our real-world experience and formulate a single-period ad planning problem which emphasizes the core structure of how ads should be planned in a broad class of new media. We develop two efficient algorithms which intelligently aggregate the high-dimensional audience space that results when ad campaigns target very specific cross-sections of the overall population, and use duality theory to show that when the audience space is aggregated using our procedure, near-optimal schedules can be produced despite significant aggregation. 
Optimality in this case is with respect to a quadratic objective chosen for tractability; however, by explicitly modeling the stochastic nature of viewers seeing ads and the low-level ad slotting heuristic of the ad server, we derive sufficient conditions that tell us when our solution is also optimal with respect to two important practical objectives: minimizing the variance of the number of impressions served, and maximizing the number of unique individuals that are shown each ad campaign.
April 1 Bren Hall 4011 1 pm	Bart Knijnenburg Graduate Student Department of Informatics University of California, Irvine ML Opportunities in Privacy Research Abstract: Personalized systems often require a considerable amount of personal information to properly learn the preferences of the user. However, privacy surveys demonstrate that Internet users want to limit the collection and dissemination of their personal data. In response, systems may give users additional control over their information disclosure. But privacy decisions are inherently difficult: they have delayed and uncertain repercussions that are difficult to trade off against the possible immediate gratification of disclosure. Can we help users to balance the benefits and risks of information disclosure in a user-friendly manner, so that they can make good privacy decisions? My idea is to develop a Privacy Adaptation Procedure that offers tailored privacy decision support. This procedure gives users personalized “nudges” and personalized “justifications” based on a context-aware prediction of their privacy preferences. In this talk I will present two pieces of research that each take a step towards this Privacy Adaptation Procedure. I then hope to start a discussion with the audience on how to proceed with this endeavor. Bio: Bart Knijnenburg is a Ph.D. candidate in Informatics at the University of California, Irvine. His work focuses on privacy decision-making and recommender systems. He received his B.S. degree in Innovation Sciences and his M.S. degree in Human-Technology Interaction from Eindhoven University of Technology, The Netherlands, and his M.A. degree in Human-Computer Interaction from Carnegie Mellon University. Bart is a leading advocate of user-experience research in recommender systems, and studies the (ir)regularities of privacy decision making. His academic work lives at http://www.usabart.nl.
April 8 Bren Hall 4011 1 pm	Qiang Liu Graduate Student Department of Computer Science University of California, Irvine Belief Propagation for Crowdsourcing Crowdsourcing on platforms like Amazon’s Mechanical Turk has become a popular paradigm for labeling large datasets. However, it has given rise to the computational task of properly aggregating the crowdsourced labels provided by a collection of unreliable and diverse annotators. On the other hand, graphical models are powerful tools for reasoning on systems with complicated dependency structures. In this talk, we approach the crowdsourcing problem by transforming it into a standard inference problem in graphical models, and apply powerful inference algorithms such as belief propagation (BP). We show that both naive majority voting and a recent algorithm by Karger, Oh, and Shah are special cases of our BP algorithm under particular modeling choices. With more careful choices, we show that our simple BP performs surprisingly well on both simulated and real-world datasets, competitive with state-of-the-art algorithms based on more complicated modeling assumptions. Our work sheds light on the important tradeoff between better modeling choices and better inference algorithms.
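The naive majority-voting baseline that the abstract above refers to can be sketched in a few lines. This is only the baseline, not the belief propagation algorithm from the talk. The `(worker_id, item_id, label)` triple format and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def majority_vote(labels):
    """Aggregate crowdsourced labels by per-item majority vote.

    labels : iterable of (worker_id, item_id, label) triples
    Returns a dict mapping each item to its most frequent label.
    Ties are broken in favor of the smallest label (an arbitrary choice
    that keeps the result deterministic).
    """
    votes = defaultdict(Counter)
    for _worker, item, label in labels:
        votes[item][label] += 1
    return {
        item: min(counts.items(), key=lambda kv: (-kv[1], kv[0]))[0]
        for item, counts in votes.items()
    }
```

Majority voting weights every worker equally, which is exactly the limitation that motivates modeling worker reliability in a graphical model.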
April 15 Bren Hall 6011 1 pm	William Noble Professor Department of Genome Sciences/Department of Computer Science and Engineering University of Washington The one-dimensional and three-dimensional structure of the genome Abstract: A variety of molecular biology technologies have recently made it clear that the function of the genome in vivo is determined both by the linear sequences of nucleotides along the chromosome and the three-dimensional conformation of chromosomes within the nucleus. In this talk, I will describe computational and statistical methods that we have developed and applied to a variety of genomes, with the goal of characterizing genome architecture and function. In particular, we have used unsupervised and semisupervised machine learning methods to infer the linear state structure of the genome, as defined by a large panel of epigenetic data sets generated by the NIH ENCODE Consortium, and we have developed methods to assign statistical confidence and infer the 3D structure of genomes from Hi-C data. About the Speaker: Dr. William Stafford Noble is Professor in the Department of Genome Sciences in the School of Medicine at the University of Washington where he has a joint appointment in the Department of Computer Science and Engineering in the College of Engineering. Previously he was a Sloan/DOE Postdoctoral Fellow with David Haussler at the University of California, Santa Cruz before he became an Assistant Professor in the Department of Computer Science at Columbia University. He graduated from Stanford University in 1991 with a degree in Symbolic Systems before receiving a Ph.D in computer science and cognitive science from UC San Diego in 1998. His research group develops and applies statistical and machine learning techniques for modeling and understanding biological processes at the molecular level. Noble is the recipient of an NSF CAREER award and is a Sloan Research Fellow.
April 22 Bren Hall 4011 1 pm	Pierre Baldi Chancellor’s Professor Department of Computer Science University of California, Irvine The Dropout Learning Algorithm Dropout is a new learning algorithm recently introduced by Hinton and his group. As stated in their abstract: “When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This overfitting is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuron learns to detect a feature that is generally helpful for producing the correct answer given the combinatorially large variety of internal contexts in which it must operate. Random ‘dropout’ gives big improvements on many benchmark tasks and sets new records for speech and object recognition.” This seminar will present a mathematical analysis of the dropout algorithm and its intriguing properties.
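As a rough illustration of the random omission of feature detectors described in the quoted abstract, here is a minimal sketch of a dropout mask applied to a list of activations. It uses the "inverted" variant, which rescales the surviving units during training so that nothing changes at test time. That rescaling convention, the function name, and its parameters are assumptions for illustration and are not taken from the talk.

```python
import random


def dropout(activations, p_drop=0.5, rng=None, training=True):
    """Randomly zero each activation with probability p_drop.

    Surviving activations are divided by (1 - p_drop) so that the expected
    value of each output equals its input ("inverted" dropout).
    At test time (training=False) the activations pass through unchanged.
    """
    if not 0.0 <= p_drop < 1.0:
        raise ValueError("p_drop must be in [0, 1)")
    if not training:
        return list(activations)
    rng = rng or random.Random()
    keep = 1.0 - p_drop
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

With `p_drop=0.5`, roughly half of the units are omitted on each call, matching the "half of the feature detectors" setting in the quotation.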
April 29 Bren Hall 4011 1 pm	Shimon Whiteson Assistant Professor Informatics Institute University of Amsterdam Multi-Objective Decision Making in Collaborative Multi-Agent Systems In collaborative multi-agent systems, teams of agents must coordinate their behavior in order to maximize their common utility. Such systems are useful, not only for addressing tasks that are inherently distributed, but also for decomposing tasks that would otherwise be too complex to solve. Unfortunately, computing coordinated behavior is computationally expensive because the number of possible joint actions grows exponentially in the number of agents. Consequently, exploiting loose couplings between agents, as expressed in graphical models, is key to rendering such decision making efficient. However, existing methods for solving such models assume there is only a single objective. In contrast, many real-world problems are characterized by the presence of multiple objectives to which the solution is not a single action but the set of actions optimal for all trade-offs between the objectives. In this talk, I will propose a new method for multi-objective multi-agent graphical games that prunes away dominated solutions. I will also discuss the theoretical support for this method and present an empirical study that shows that it can tackle multi-objective problems much faster than alternatives that do not exploit loose couplings.
May 6 Bren Hall 4011 1 pm	Anoop Korattikara Graduate Student Department of Computer Science University of California, Irvine Markov Chain Monte Carlo and the Bias-Variance Tradeoff Bayesian posterior sampling can be painfully slow on very large datasets, since traditional MCMC methods such as Hybrid Monte Carlo are designed to be asymptotically unbiased and require processing the entire dataset to generate each sample. Thus, given a small amount of sampling time, the variance of estimates computed using such methods could be prohibitive. We argue that lower risk estimates can often be obtained using “approximate” MCMC methods that mix very fast (and thus lower the variance quickly) at the expense of a small bias in the stationary distribution. I will first talk about two such biased algorithms: Stochastic Gradient Langevin Dynamics and its successor Stochastic Gradient Fisher Scoring, both of which use stochastic gradients estimated from mini-batches of data, allowing them to mix very fast. Then I will present our current work on a new (biased) MCMC algorithm that uses a sequential hypothesis test to approximate the Metropolis-Hastings test, allowing us to accept/reject samples with high confidence using only a fraction of the data required for the exact test.
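To make the idea of mini-batch gradients in Stochastic Gradient Langevin Dynamics concrete, here is a minimal sketch for a toy model: data drawn from a Gaussian with unknown mean and unit variance, with a Gaussian prior on the mean. The model, the constant step size, and all parameter names are assumptions chosen for illustration. The sketch does not cover Stochastic Gradient Fisher Scoring or the sequential hypothesis test mentioned in the abstract.

```python
import random


def sgld_gaussian_mean(data, n_steps=3000, batch_size=10, step=1e-4,
                       prior_var=100.0, seed=0):
    """Sample the mean of a unit-variance Gaussian with SGLD.

    Each update uses a mini-batch gradient rescaled by N / n, plus injected
    Gaussian noise whose variance equals the step size.
    """
    rng = random.Random(seed)
    n_total = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior N(0, prior_var).
        grad_prior = -theta / prior_var
        # Mini-batch estimate of the full-data log-likelihood gradient.
        grad_lik = (n_total / batch_size) * sum(x - theta for x in batch)
        noise = rng.gauss(0.0, step ** 0.5)
        theta += 0.5 * step * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples
```

Each step touches only `batch_size` data points instead of the whole dataset. With a constant step size the chain is slightly biased, which is the kind of bias-for-speed trade the abstract argues can lower overall risk.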
May 13 Bren Hall 4011 1 pm	Katerina Fragkiadaki Graduate Student Department of Computer Science University of Pennsylvania Multi-Granularity Steering for Human Actions: Motion, Pose and Intention Tracking people and their body pose in videos is a central problem in computer vision. Standard tracking representations typically reason about temporal coherence of detected bodies and parts. They have difficulty tracking people under partial occlusions or wild body deformations, where people and body pose detectors are often inaccurate, due to the small number of training examples in comparison to the exponential variability of such configurations. In this talk, I will present novel tracking representations that make it possible to track people and their body pose by exploiting information at multiple granularities when available: whole body, parts, or pixel-wise motion correspondences and their segmentations. A key challenge is resolving contradictions among different information granularities, such as detections and motion estimates in the case of false alarm detections or leaking motion affinities. I will introduce graph steering, a framework that specifically targets inference under potentially sparse unary detection potentials and dense pairwise motion affinities (a particular characteristic of the video signal), in contrast to standard MRFs. We will present three instances of steering. First, we study people detection and tracking under persistent occlusions. I will demonstrate how to steer dense optical flow trajectory affinities with repulsions from sparse confident detections to reach a global consensus of detection and tracking in crowded scenes. Second, we study human motion and pose estimation. We segment hard-to-detect, fast-moving body limbs from their surrounding clutter and match them against pose exemplars to detect body pose and improve body part motion estimates with kinematic constraints. 
Finally, I will show how we can learn certainty of detections under various pose and motion specific contexts, and use such certainty during steering for jointly inferring multi-frame body pose and video segmentation. We show empirically that such multi-granularity tracking representation is worthwhile, obtaining significantly more accurate body and pose tracking in popular datasets. Bio: Katerina Fragkiadaki is a Ph.D. student in Computer and Information Science at the University of Pennsylvania. She received her diploma in Computer Engineering from the National Technical University of Athens. She works on tracking, segmentation and pose estimation of people under close interactions, for understanding their actions and intentions. She also works on segmenting and tracking cell populations for understanding and modeling cell behavior.
May 20 Bren Hall 4011 1 pm	Dennis Park Graduate Student Department of Computer Science University of California, Irvine Multiple Solutions via M-Best MAP This talk will serve two purposes. In the first part, I will provide a tutorial motivating and introducing M-best algorithms, particularly for those who are new to these approaches. Like other intelligent systems, applications in computer vision rely heavily on MAP hypotheses of probabilistic models. However, predicting a single (most probable) hypothesis is often suboptimal when the training data is noisy or the underlying model is complex. As an alternative, various M-best algorithms have been introduced, mainly in the speech recognition community. By walking through a simple example using two M-best algorithms, Nilsson’98 and Yanover & Weiss’03, the audience will gain insights into the algorithms and their application to various graphical models. In the second part, I will talk about more recent work on applications of M-best algorithms to computer vision problems. The main hurdle for a direct application of traditional M-best algorithms to computer vision applications is a lack of diversity: the second-best hypothesis is only one pixel off from the best one. To overcome this limitation, we developed a novel M-best algorithm which incorporates non-maximal suppression into Yanover & Weiss’s algorithm. When applied to a model for pose estimation of the human body, the algorithm produces diverse and high-scoring poses which are re-evaluated using tracking models for videos, achieving more accurate tracks of human poses.
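The diversity problem described in the abstract above can be seen on a toy model. The sketch below ranks all assignments of a small discrete model by brute force and then greedily keeps only those that differ from every already-chosen hypothesis in at least `min_hamming` variables, a simple stand-in for non-maximal suppression. It is neither Nilsson's nor Yanover & Weiss's algorithm, nor the speaker's method. The function name and parameters are hypothetical, and exhaustive enumeration is feasible only for tiny models.

```python
from itertools import product


def diverse_m_best(score, n_vars, n_states, m, min_hamming=1):
    """Return up to m high-scoring assignments that are mutually diverse.

    score       : function mapping an assignment (tuple of states) to a number
    min_hamming : minimum number of variables in which a new hypothesis must
                  differ from every hypothesis chosen so far
    """
    # Brute-force ranking of every assignment, best first.
    candidates = sorted(product(range(n_states), repeat=n_vars),
                        key=score, reverse=True)
    chosen = []
    for x in candidates:
        far_enough = all(
            sum(a != b for a, b in zip(x, y)) >= min_hamming for y in chosen
        )
        if far_enough:
            chosen.append(x)
            if len(chosen) == m:
                break
    return chosen
```

With `min_hamming=1` this reduces to plain M-best, where the runner-up typically differs from the MAP assignment in a single variable. Raising `min_hamming` trades score for diversity.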
May 27	Memorial Day (no seminar)
June 3 Bren Hall 4011 1 pm	Kamalika Chaudhuri Assistant Professor Department of Computer Science and Engineering University of California, San Diego Challenges in Differentially-Private Data Analysis Machine learning algorithms increasingly work with sensitive information on individuals, and hence the problem of privacy-preserving data analysis (how to design data analysis algorithms that operate on the sensitive data of individuals while still guaranteeing the privacy of individuals in the data) has achieved great practical importance. In this talk, we address two problems in differentially private data analysis. First, we address the problem of privacy-preserving classification, and present an efficient classifier which is private in the differential privacy model of Dwork et al. Our classifier works in the ERM (empirical loss minimization) framework, and includes privacy-preserving logistic regression and privacy-preserving support vector machines. We show that our classifier is private, provide analytical bounds on the sample requirement of our classifier, and evaluate it on real data. We next address the question of differentially private statistical estimation. We draw a concrete connection between differential privacy and gross error sensitivity, a measure of robustness of a statistical estimator, and show how these two notions are quantitatively related. Based on joint work with Claire Monteleoni (George Washington University), Anand Sarwate (TTI Chicago), and Daniel Hsu (Microsoft Research). Bio: Kamalika Chaudhuri received a Bachelor of Technology degree in Computer Science and Engineering in 2002 from the Indian Institute of Technology, Kanpur, and a PhD in Computer Science from UC Berkeley in 2007. After a stint as a postdoctoral researcher at the Information Theory and Applications Center at UC San Diego, she joined the CSE department at UCSD as an assistant professor in 2010. 
Kamalika’s research is on the design and analysis of machine-learning algorithms and their applications. In particular, her interests lie in clustering, online learning, and privacy-preserving machine-learning, and applications of machine-learning and algorithms to practical problems in other areas.
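For readers new to differential privacy, the following sketch shows the standard Laplace mechanism applied to a bounded mean. It is a generic textbook construction, not the private classifier or estimator from the talk above. It assumes that neighboring datasets differ by replacing one record, so that the clipped mean has sensitivity `(upper - lower) / n`. All names are illustrative.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_mean(values, lower, upper, epsilon, rng=None):
    """Release the mean of `values` with epsilon-differential privacy.

    Values are clipped to [lower, upper], so replacing one record changes
    the mean by at most (upper - lower) / n; Laplace noise with scale
    sensitivity / epsilon is then added.
    """
    if epsilon <= 0 or upper <= lower or not values:
        raise ValueError("need epsilon > 0, upper > lower, and non-empty data")
    rng = rng or random.Random()
    clipped = [min(max(v, lower), upper) for v in values]
    sensitivity = (upper - lower) / len(clipped)
    true_mean = sum(clipped) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)
```

Smaller values of `epsilon` give stronger privacy but noisier answers. The noise scale shrinks as the dataset grows, which is one reason sample-size requirements appear in analyses of private learning.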
June 7 Bren Hall 3011 11 am	Maja Matarić Professor and Chan Soon-Shiong Chair Department of Computer Science/Neuroscience/Pediatrics University of Southern California Human-Machine Interaction Methods for Socially Assistive Robotics Socially assistive robotics (SAR) is a new field of intelligent robotics that focuses on developing machines capable of assisting users through social rather than physical interaction. The robot’s physical embodiment is at the heart of SAR’s effectiveness, as it leverages the inherently human tendency to engage with lifelike (but not necessarily human-like or otherwise biomimetic) social behavior. People readily ascribe intention, personality, and emotion to robots; SAR leverages this engagement stemming from non-contact social interaction involving speech, gesture, movement demonstration and imitation, and encouragement, to develop robots capable of monitoring, motivating, and sustaining user activities and improving human learning, training, performance and health outcomes. Human-robot interaction (HRI) for SAR is a growing multifaceted research area at the intersection of engineering, health sciences, neuroscience, social, and cognitive sciences. This talk will describe our research into embodiment, modeling and steering social dynamics, and long-term user adaptation for SAR. The research will be grounded in projects involving analysis of multi-modal activity data, modeling personality and engagement, formalizing social use of space and non-verbal communication, and personalizing the interaction with the user over a period of months, among others. The presented methods and algorithms will be validated on implemented SAR systems evaluated by human subject cohorts from a variety of user populations, including stroke patients, children with autism spectrum disorder, and elderly with Alzheimer’s and other forms of dementia. 
Bio: Maja Mataric is professor and Chan Soon-Shiong chair in Computer Science, Neuroscience, and Pediatrics at the University of Southern California, founding director of the USC Center for Robotics and Embedded Systems (cres.usc.edu), co-director of the USC Robotics Research Lab (robotics.usc.edu) and Vice Dean for Research in the USC Viterbi School of Engineering. She received her PhD in Computer Science and Artificial Intelligence from MIT in 1994, MS in Computer Science from MIT in 1990, and BS in Computer Science from the University of Kansas in 1987. She is a Fellow of the American Association for the Advancement of Science (AAAS), Fellow of the IEEE, and recipient of the Presidential Awards for Excellence in Science, Mathematics & Engineering Mentoring (PAESMEM), the Anita Borg Institute Women of Vision Award for Innovation, Okawa Foundation Award, NSF Career Award, the MIT TR100 Innovation Award, and the IEEE Robotics and Automation Society Early Career Award. She served as the elected president of the USC faculty and the Academic Senate. At USC she has been awarded the Viterbi School of Engineering Service Award and Junior Research Award, the Provost’s Center for Interdisciplinary Research Fellowship, the Mellon Mentoring Award, the Academic Senate Distinguished Faculty Service Award, and a Remarkable Woman Award. Her research is currently developing robot-assisted therapies for children with autism spectrum disorders, stroke and traumatic brain injury survivors, and individuals with Alzheimer’s Disease and other forms of dementia. Details about her research are found at http://robotics.usc.edu/interaction/.
June 10 Bren Hall 4011 1 pm	Mark Stalzer Executive Director Center for Advanced Computing Research California Institute of Technology Trends in Scientific Discovery Engines and Applications This talk is about trends in computing technology that are leading to exascale-class systems for both scientific simulations and data reduction. The underlying themes are power consumption, the massive increase in concurrency, and architectural balance for “Big Data” systems. Applications that require balance are presented in astronomy, high-energy physics, and engineering. Optimal uncertainty quantification is shown as a way to rigorously connect simulations with Big Data.
June 14 Bren Hall 4011 2 pm	Nima Dokoohaki Postdoctoral Research Assistant Department of Software and Computer Systems Royal Institute of Technology (KTH) Trust-Aware User Profile Mining, Recommendation and Retrieval We have introduced the notion of augmenting the user profiling process with trust, as a solution to the problem of uncertainty and unmanageable exposure of personal data during access, mining and retrieval by web applications. Our solution suggests explicit modeling of trust and embedding trust metrics and mechanisms within the very fabric of user profiles. This has in turn allowed information systems to consume and understand this extra knowledge in order to improve interaction and collaboration among individuals and systems. When formalizing such profiles, another challenge is to realize the increasingly important notion of users' privacy preferences. The profiles are designed to incorporate user preferences, allowing target systems to understand users' privacy concerns during their interaction. Highlighted results start from the modeling of adaptive user profiles incorporating users' taste, trust and privacy preferences. This in turn led to the proposal of several ontologies for modeling user and content characteristics, to improve indexing and retrieval of user content and profiles across the platform. Sparsity and uncertainty of profiles were studied through frameworks of data mining and machine learning applied to profile data taken from on-line social networks. Results of mining and population of data from social networks along with profile data increased the accuracy of intelligent suggestions made by the system to improve navigation of users in on-line and off-line museum interfaces. These results were highlighted mainly in the context of the EU FP7 Smartmuseum project. We also introduced several trust-based recommendation techniques and frameworks capable of mining implicit and explicit trust across ratings networks taken from the social and opinion web.
The resulting recommendation algorithms have been shown to increase the accuracy of profiles, through incorporation of knowledge of items and users and diffusing them along the trust networks. At the same time, focusing on automated distributed management of profiles, we showed that the coverage of the system can be increased effectively, surpassing comparable state-of-the-art techniques. We have shown that trust clearly increases the accuracy of suggestions predicted by the system. To assure the overall privacy of such value-laden systems, privacy was given a direct focus when architectures and metrics were proposed, and we showed that a joint optimal setting for accuracy and perturbation techniques can maintain accurate output. Finally, focusing on hybrid models of web data and recommendations motivated us to study the impact of trust in the context of topic-driven recommendation in social and opinion media, which in turn helped us to show that leveraging content-driven and tie-strength networks can improve system accuracy for several important web computing tasks. Speaker Bio: Nima Dokoohaki holds an MSc (2007) in software engineering of distributed systems, and a PhD (2013) in information and communication technologies from KTH-Royal Institute of Technology, Sweden. He is currently a postdoctoral research assistant at the software and computer systems (SCS) lab at KTH, where he focuses on big data and social informatics. His research interests include trust, social network mining and analysis, and recommender systems.

Winter 2013

Standard

January 7 Bren Hall 4011 1 pm	Majid Janzamin Graduate Student Department of Electrical Engineering and Computer Science University of California, Irvine High-Dimensional Covariance Decomposition into Sparse Markov and Independence Domains Fitting high-dimensional data involves a delicate tradeoff between faithful representation and the use of sparse models. Too often, sparsity assumptions on the fitted model are too restrictive to provide a faithful representation of the observed data. In this talk, we present a novel framework incorporating sparsity in different domains. We decompose the observed covariance matrix into a sparse Gaussian Markov model (with a sparse precision matrix) and a sparse independence model (with a sparse covariance matrix). Our framework incorporates sparse covariance and sparse precision estimations as special cases and thus introduces a richer class of high-dimensional models. We characterize sufficient conditions for identifiability of the two models, viz., the Markov and independence models. We propose an efficient decomposition method based on a modification of the popular $\ell_1$-penalized maximum-likelihood estimator ($\ell_1$-MLE). We establish that our estimator is consistent in both domains, i.e., it successfully recovers the supports of both the Markov and independence models, when the number of samples $n$ scales as $n = \Omega(d^2 \log p)$, where $p$ is the number of variables and $d$ is the maximum node degree in the Markov model. Our experiments validate these results and also demonstrate that our models have better inference accuracy under simple algorithms such as loopy belief propagation.
January 14 Bren Hall 4011 1 pm	Christian Shelton Associate Professor Department of Computer Science and Engineering University of California, Riverside Machine Learning for Critical Care Medicine Medicine is becoming a “big data” discipline. In many ways, it shares more in common with engineering and business than with lab sciences: while controlled experiments can be performed, most data are available from live practice with the aim of solving a problem, not exploration of hypotheses. In this talk I will discuss my work in collaboration with Children’s Hospital Los Angeles in applying machine learning to improve health care, particularly pediatric intensive care. I will use two current projects to drive the discussion: (1) monitoring of blood CO2 and pH levels for patients on mechanical ventilation and (2) predicting acute kidney injury and identifying potential causes. I will describe the data collection, how the data do and do not fit into machine learning assumptions, and the current state and trends in medical data. Both problems have been tackled with a variety of methods and I will summarize our findings and lessons in applying machine learning to medical data. Bio: Christian Shelton is an Associate Professor of Computer Science and Engineering at the University of California at Riverside. His research is in machine learning with a particular interest in dynamic processes. He has worked on applications as varied as computer vision, sociology, game theory, decision theory, and computational biology. He has been a faculty member at UC Riverside since 2003. He received his PhD from MIT in 2001 and his bachelor's degree from Stanford in 1996.
January 21 (no seminar)	Martin Luther King, Jr. Day
January 28 Bren Hall 4011 1 pm	Ragupathyraj Valluvan Graduate Student Department of Electrical Engineering and Computer Science University of California, Irvine Predicting and Interpreting Dynamic Social Interactions via Conditional Latent Random Fields We consider the problem of predicting and interpreting dynamic social interactions among a time-varying set of participants. We model the interactions via a dynamic social network with joint edge and vertex dynamics. It is natural to expect that the accuracy of vertex prediction (i.e., whether an actor participates or not at a given time) strongly affects the ability to predict dynamic network evolution accurately. A conditional latent random field (CLRF) model is employed here to model the joint vertex evolution. This model family can incorporate dependence in vertex co-presence, found in many social settings (e.g., subgroup structure, selective pairing). Moreover, it can incorporate the effect of covariates (e.g., seasonality). We introduce a novel approach for fitting such CLRF models which leverages recent results for learning latent tree models and combines them with a parametric model for covariate effects and a logistic model for edge prediction (i.e., social interactions) given the vertex predictions. We apply this approach to both synthetic data and a classic social network data set involving interactions among windsurfers on a Southern California beach. Experiments conducted show the potential to discover hidden social relationship structures and a significant improvement in prediction accuracy of the vertex and edge set evolution (about 45% for conditional vertex participation accuracy and 122% for overall edge prediction accuracy) over the baseline dynamic network regression approach.
February 4 Bren Hall 4011 1 pm	Matthias Blume Senior Director of Analytics CoreLogic Entity Disambiguation Entity disambiguation (a.k.a. Entity Resolution, Record Linking, People Search, Customer Pinning, Merge/Purge, …) determines which data records correspond to distinct entities (persons, companies, locations, etc.) when IDs such as SSN are not available. Matthias will present an overview of the field and a technique that can utilize any available attributes including co-occurring entities, relations, and topics from unstructured text. It automatically learns the information value of each feature from the data. By using a greedy merge approach and some tricks to avoid unnecessary match operations, it is fast. Finally, we will explore possible vector space and graph representations of the problem, alternative approaches that have been tried, and suggest future work based on reinforcement learning and active learning. Bio: Matthias Blume is Senior Director of Analytics at CoreLogic, the nation’s largest real estate data provider. His team develops solutions for mortgage fraud detection, consumer credit scoring, automated valuation models, and more. Previously, he worked in marketing optimization, text analytics, and the gamut of financial services analytics at Redlign, Covario, and HNC/FICO. He received his PhD in Electrical and Computer Engineering from UCSD and a BS from Caltech.
February 11 Bren Hall 4011 1 pm	Eric Bax Director, Marketplace Design Yahoo! Validating Network Classifiers with Cohorts Networks play important roles in our lives, from protein activation networks that determine how our bodies develop to social networks and networks for transportation and power transmission. Networks are interesting for machine learning because they grow in interesting ways. A person joins a social network because their friend is already in it. A patient joins a network of disease infection because they are in contact with someone who has been infected. A new bridge is built because there are major transportation facilities on both sides of a body of water. Networks form iteratively; each new cohort of nodes depends on nodes already present. This talk discusses a way to apply machine learning methods to network classifiers for networks that grow by adding cohorts.
February 18 (no seminar)	Presidents’ Day
February 25 Bren Hall 4011 1 pm	Kristina Lerman Project Leader University of Southern California/Information Sciences Institute Social Dynamics of Information It is widely believed that information spreads through a social network much like a virus, with “infected” individuals transmitting it to their friends, enabling information to reach many people. However, our studies of social media indicate that most information epidemics fail to reach viral proportions. We show that psychological factors fundamentally distinguish social contagion from viral contagion. Specifically, people have finite attention, which they divide over all incoming stimuli. This makes highly connected people less “susceptible” to infection and stops information spread. In the second part of the talk I explore the connection between dynamics and network structure. I show that to find interesting structure, network analysis has to consider not only a network’s links but also the dynamics of information flow. I introduce dynamics-aware network analysis methods and demonstrate that they can identify more meaningful structures in social media networks than popular alternatives.

Fall 2012

Standard

October 1 Bren Hall 4011 1 pm	Mohsen Hejrati Graduate Student Department of Computer Science University of California, Irvine Analyzing 3D Objects in Cluttered Images We present an approach to detecting and analyzing the 3D configuration of objects in real-world images with heavy occlusion and clutter. We focus on the application of finding and analyzing cars. We do so with a two-layer model; the first layer reasons about 2D appearance changes due to within-class variation and viewpoint. Rather than using a global view-based model, we describe a compositional representation that models a large number of effective views using a small number of local view-based templates. We use this model to propose candidate detections, which are then refined by our second layer, a 3D statistical model that reasons about 3D shape changes and 3D camera viewpoints. We demonstrate state-of-the-art accuracy on challenging images from the PASCAL VOC 2011 dataset.
October 8 Bren Hall 4011 1 pm	Sergey Kirshner Assistant Professor Department of Statistics Purdue University Copulas in Machine Learning or How to Make Sense of Multi-Dimensional Non-Gaussian Real-Valued Data As a number of application domains, including finance, hydrology, and astronomy, produce high-dimensional multivariate data, there is an increasing interest in models which can capture non-linear dependence between the observations. Enter copulas, a statistical approach which separates the marginal distributions for random variables from their dependence structure. I will go over the recent work on using copulas in two different settings. In the first setting, graphical models are developed for copulas with the goal of modeling non-Gaussian multivariate real-valued data. I will focus on tree-structured copulas in particular, as they provide a convenient building block for such models, and on their applications to modeling of multi-site rainfall. In the second setting, copulas are used to construct non-parametric robust estimators of dependence (e.g., information). Among applications of such estimators is a new robust approach to independent component analysis. Speaker Bio: Sergey Kirshner is an Assistant Professor of Statistics at Purdue University. Prior to joining Purdue, he was a postdoctoral fellow with Alberta Ingenuity Centre for Machine Learning at the Department of Computing Science at the University of Alberta. Before that, he was a graduate student and then a postdoc at the Donald Bren School of Information and Computer Sciences at the University of California, Irvine in Padhraic Smyth’s research group. His research interests lie in the area of statistical machine learning, more specifically, computational methods for learning and inference for sparse models of high-dimensional data, and their applications to scientific problems.
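Illustrative note on the abstract above: the idea that a copula separates marginal distributions from dependence structure can be sketched in a few lines. The following is a minimal sketch under stated assumptions, not the speaker's method: it uses a bivariate Gaussian copula and exponential marginals, both chosen only for illustration, and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def sample_gaussian_copula(n, rho, seed=0):
    """Draw n pairs (u, v) on the unit square from a bivariate Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        # Correlated standard normals carry the dependence structure.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # The normal CDF maps each coordinate to a uniform marginal.
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def exponential_quantile(u, rate):
    """Inverse CDF of an exponential distribution (assumed marginal)."""
    u = min(u, 1.0 - 1e-12)  # guard against log(0)
    return -math.log(1.0 - u) / rate


def sample_dependent_exponentials(n, rho, rate_x, rate_y, seed=0):
    """Attach exponential marginals to Gaussian-copula dependence."""
    return [
        (exponential_quantile(u, rate_x), exponential_quantile(v, rate_y))
        for u, v in sample_gaussian_copula(n, rho, seed)
    ]
```

Changing `rate_x` or `rate_y` alters only the marginals, while changing `rho` alters only the dependence, which is the separation the abstract refers to.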
October 15 Bren Hall 4011 1 pm	Don Patterson Associate Professor Department of Informatics University of California, Irvine Gesture Recognition with Erlang-Cox Models To Identify Neurological Disorders in Premature Babies In this talk I will describe a system that leverages accelerometers to recognize a particular involuntary gesture in babies that have been born preterm. These gestures, known as cramped-synchronized general movements, are highly correlated with a diagnosis of Cerebral Palsy. In order to test our system we recorded data from 10 babies admitted to the newborn intensive care unit at the UCI Medical Center. We demonstrate a Markov model based technique for recognizing gestures from accelerometers that explicitly represents duration. We do this by embedding an Erlang-Cox state transition model, which has been shown to accurately represent the first three moments of a general distribution, within a Dynamic Bayesian Network (DBN). The transition probabilities in the DBN can be learned via Expectation-Maximization or by using closed-form solutions. We show that by treating instantaneous machine learning classification values as observations and explicitly modeling duration, we improve the recognition of Cramped-Synchronized General Movements, a motion highly correlated with an eventual diagnosis of Cerebral Palsy. Validated video observation annotations were utilized as ground truth. Finally, we conducted an analysis to understand the clinical impact of this technique.
October 22 Bren Hall 4011 1 pm	Levi Boyles Graduate Student Department of Computer Science University of California, Irvine The Time-Marginalized Coalescent Prior for Hierarchical Clustering We introduce a new prior for use in Nonparametric Bayesian Hierarchical Clustering. The prior is constructed by marginalizing out the time information of Kingman’s coalescent, providing a prior over tree structures which we call the Time-Marginalized Coalescent (TMC). This allows for models which factorize the tree structure and times, providing two benefits: more flexible priors may be constructed and more efficient Gibbs type inference can be used. We demonstrate this on an example model for density estimation and show the TMC achieves competitive experimental results.
October 29 Bren Hall 4011 1 pm	Pierre Baldi Chancellor’s Professor Department of Computer Science University of California, Irvine Deep Architectures and Deep Learning Deep architectures are important for machine learning, for engineering applications, and for understanding the brain. In this talk we will provide a brief historical overview of deep architectures from their 1950s origins to today. Motivated by this overview, we will study and prove several theorems regarding deep architectures and one of their main ingredients–autoencoder circuits–in particular in the unrestricted Boolean and unrestricted probabilistic cases. We will show how these analyses lead to a new general family of learning algorithms for deep architectures–the deep target (DT) algorithms. The DT approach converts the problem of learning a deep architecture into the problem of learning many shallow architectures by providing learning targets for the deep layers. Finally, we will present simulation results and applications of deep architectures and DT algorithms to protein structure prediction.
November 5 Bren Hall 4011 1 pm	Daniel Whiteson Associate Professor Department of Physics and Astronomy University of California, Irvine Searching for the Higgs Boson and Beyond with Machine Learning Tools High-energy physicists try to decompose matter into its most fundamental pieces by colliding particles at extreme energies. But to extract clues about the structure of matter from these collisions is not a trivial task, due to the incomplete data we can gather regarding the collisions, the subtlety of the signals we seek and the large rate and high dimensionality of the data. These challenges are not unique to high energy physics, and there is the potential for great progress in collaboration between high energy physicists and machine learning experts. I will describe the nature of the physics problem, the challenges we face in analyzing the data, the previous successes and failures of some ML techniques, and the open challenges.
November 12 (no seminar)	Veterans Day
November 16 Bren Hall 4011 1 pm	John Fisher Principal Research Scientist CSAIL MIT Information Gathering Under Resource Constraints: Greed is Good In many distributed sensing problems, resource constraints preclude the utilization of all sensing assets. By way of example, inference in distributed sensor networks presents a fundamental trade-off between the utility in a distributed set of measurements versus the resources expended to acquire them, fuse them into a model of uncertainty, and then transmit the resulting model. Active approaches seek to manage sensing resources so as to maximize a utility function while incorporating constraints on resource expenditures. Such approaches are complicated by several factors. Firstly, the complexity of sensor planning is typically exponential in both the number of sensing actions and the planning time horizon. Consequently, optimal planning methods are intractable except for very small scale problems. Secondly, the choice of utility function may vary over time and across users. Approximate approaches (cf. [Zhao et al., 2002, Kreucher et al., 2005]) have been proposed that treat a subset of these issues; however, the approaches are indirect and do not scale to large problems. In this presentation, I will discuss the use of information measures for resource allocation in distributed sensing systems. Such measures are appealing due to a variety of useful properties. For example, recent results of [Nguyen et al., 2009] link a class of information measures to surrogate risk functions and their associated bounds on excess risk [Bartlett et al., 2003]. Consequently, these measures are suitable proxies for a wide variety of risk functions. I will discuss a method [Williams et al., 2007a] which enables long time-horizon sensor planning in the context of state estimation with a distributed sensor network. The approach integrates the value of information discounted by resource expenditures over a rolling time horizon.
Simulation results demonstrate that the resulting algorithm can provide similar estimation performance to that of greedy and myopic methods for a fraction of the resource expenditures. Furthermore, recently developed methods [Fisher III et al., 2009] have been shown to be useful for estimating these quantities in complex signal models. Finally, one consequence of this algorithmic development is a set of new fundamental performance bounds for information gathering systems [Williams et al., 2007b] which show that, under mild assumptions, optimal (though intractable) planning schemes can yield no better than twice the performance of greedy methods for certain choices of information measures. The bound can be shown to be sharp. Additional on-line computable bounds, often tighter in practice, are presented as well. This is joint work with Georgios Papachristoudous, Jason L. Williams, & Michael Siracusa. Bio: John Fisher is Principal Research Scientist at the MIT Computer Science and Artificial Intelligence Laboratory. His research focuses on information-theoretic approaches to machine learning, computer vision, and signal processing. Application areas include signal-level approaches to multi-modal data fusion, signal and image processing in sensor networks, distributed inference under resource constraints, resource management in sensor networks, and analysis of seismic and radar images. In collaboration with the Surgical Planning Lab at Brigham and Women’s Hospital, he is developing nonparametric approaches to image registration and functional imaging. He received a BS and MS in Electrical Engineering at the University of Florida in 1987 and 1989, respectively. He earned a PhD in Electrical and Computer Engineering in 1997.
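Illustrative note on the abstract above: the claim that an optimal plan can be no better than twice a greedy one can be checked by brute force on a toy problem. The sketch below is an assumption-laden stand-in, not the method from the talk: it replaces the information measures with a simple coverage count (a monotone submodular utility), uses a fixed budget on the number of sensors, and all names and data are hypothetical.

```python
from itertools import combinations


def coverage(chosen, sensor_cells):
    """Utility of a sensor subset: number of distinct cells it observes."""
    covered = set()
    for sensor in chosen:
        covered |= sensor_cells[sensor]
    return len(covered)


def greedy_select(sensor_cells, budget):
    """Repeatedly add the sensor with the largest marginal gain in coverage."""
    chosen = []
    remaining = set(sensor_cells)
    for _ in range(min(budget, len(sensor_cells))):
        base = coverage(chosen, sensor_cells)
        best = max(
            sorted(remaining),  # sorted for deterministic tie-breaking
            key=lambda s: coverage(chosen + [s], sensor_cells) - base,
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


def brute_force_select(sensor_cells, budget):
    """Exhaustive search over all subsets of the given size (tiny inputs only)."""
    return max(
        combinations(sorted(sensor_cells), budget),
        key=lambda subset: coverage(subset, sensor_cells),
    )


# Hypothetical toy instance in which greedy is suboptimal.
SENSOR_CELLS = {"a": {2, 3, 4, 5}, "b": {1, 2, 3}, "c": {4, 5, 6}}
greedy_value = coverage(greedy_select(SENSOR_CELLS, 2), SENSOR_CELLS)
optimal_value = coverage(brute_force_select(SENSOR_CELLS, 2), SENSOR_CELLS)
```

On this instance greedy first takes sensor `a` and ends with a coverage of 5, while the exhaustive optimum (`b` and `c`) covers 6, which stays within a factor of two of the greedy value.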
November 19 Bren Hall 4011 1 pm	Lise Getoor Associate Professor Department of Computer Science University of Maryland, College Park Statistical Relational Learning and Graph Identification Within the machine learning community, there is a growing interest in learning structured models from input data that is itself structured, an area often referred to as statistical relational learning (SRL). I’ll begin with a brief overview of SRL, and discuss its relation to network analysis, extraction, and alignment. I’ll then describe our recent work on graph identification. Graph identification is the process of transforming an observed input network into an inferred output graph. It involves cleaning the data (inferring missing information and correcting mistakes) and is an important first step before any further network analysis is performed. It requires a combination of entity resolution, link prediction, and collective classification techniques. I will overview two approaches to graph identification: 1) coupled conditional classifiers (C^3), and 2) probabilistic soft logic (PSL). I will describe their mathematical foundations, learning and inference algorithms, and empirical evaluation, showing their power in terms of both accuracy and scalability. I will conclude by highlighting connections to privacy in social network data and other current big data challenges. Bio: Lise Getoor is an Associate Professor in the Computer Science Department at the University of Maryland, College Park and University of Maryland Institute for Advanced Computer Studies. Her research areas include machine learning, and reasoning under uncertainty; in addition she works in data management, visual analytics and social network analysis. She is a board member of the International Machine Learning Society, a former Machine Learning Journal Action Editor, Associate Editor for the ACM Transactions on Knowledge Discovery from Data, JAIR Associate Editor, and she has served on the AAAI Council.
She was conference co-chair for ICML 2011, and has served on the PC of many conferences including the senior PC for AAAI, ICML, KDD, UAI and the PC of SIGMOD, VLDB, and WWW. She is a recipient of an NSF Career Award and was awarded a National Physical Sciences Consortium Fellowship. Her work has been funded by ARO, DARPA, IARPA, Google, IBM, LLNL, Microsoft, NGA, NSF, Yahoo! and others. She received her PhD from Stanford University, her Master’s degree from University of California, Berkeley, and her undergraduate degree from University of California, Santa Barbara.
November 26 Bren Hall 4011 1 pm	Shiwei Lan Graduate Student Department of Statistics University of California, Irvine Lagrangian Dynamical Monte Carlo Hamiltonian Monte Carlo (HMC) improves the computational efficiency of the Metropolis algorithm by reducing its random walk behavior. Riemannian Manifold HMC (RMHMC) further improves HMC’s performance by exploiting the geometric properties of the parameter space. However, the geometric integrator used for RMHMC involves implicit equations that require costly numerical analysis (e.g., fixed-point iteration). In some cases, the computational overhead for solving implicit equations undermines RMHMC’s benefits. To avoid this problem, we propose an explicit geometric integrator that replaces the momentum variable in RMHMC by velocity. We show that the resulting transformation is equivalent to transforming Riemannian Hamilton dynamics to Lagrangian dynamics. Experimental results show that our method improves RMHMC’s overall computational efficiency. All computer programs and data sets are available online in order to allow replications of the results reported in this paper. Link to arXiv: http://arxiv.org/abs/1211.3759
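Illustrative note on the abstract above: for readers unfamiliar with the baseline being improved, here is a minimal sketch of plain HMC with an explicit leapfrog integrator on a one-dimensional target. It is not RMHMC and not the Lagrangian integrator proposed in the talk; the unit mass, step size, trajectory length, and function names are all assumptions made for illustration.

```python
import math
import random


def leapfrog(q, p, grad_u, step, n_steps):
    """Explicit leapfrog integration of Hamiltonian dynamics with unit mass."""
    p -= 0.5 * step * grad_u(q)          # initial half step for momentum
    for i in range(n_steps):
        q += step * p                    # full step for position
        if i < n_steps - 1:
            p -= step * grad_u(q)        # full step for momentum
    p -= 0.5 * step * grad_u(q)          # final half step for momentum
    return q, p


def hmc_sample(u, grad_u, q0, n_samples, step=0.2, n_steps=8, seed=0):
    """Sample from a density proportional to exp(-u(q)) using basic HMC."""
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)          # resample momentum each iteration
        h_old = u(q) + 0.5 * p * p
        q_new, p_new = leapfrog(q, p, grad_u, step, n_steps)
        h_new = u(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's discretization error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
        samples.append(q)
    return samples


def standard_normal_potential(q):
    """Potential energy of a standard normal target (assumed example)."""
    return 0.5 * q * q


def standard_normal_gradient(q):
    """Gradient of the standard normal potential."""
    return q
```

Because the momentum is refreshed and then followed along a long trajectory, successive samples are far less correlated than those of a random-walk Metropolis chain, which is the efficiency gain the abstract starts from.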
November 30 Bren Hall 4011 1 pm	Scott Sanner Senior Researcher Machine Learning Group NICTA Data Structures for Efficient Inference and Optimization in Expressive Continuous Domains To date, our ability to perform exact closed-form inference or optimization with continuous variables is largely limited to special well-behaved cases. This talk argues that with an appropriate representation and data structure, we can vastly expand the class of models for which we can perform exact, closed-form inference. This talk is in two parts. In the first part, I introduce an extension of the algebraic decision diagram (ADD) to continuous variables — termed the extended ADD (XADD) — to represent arbitrary piecewise functions over discrete and continuous variables and show how to efficiently compute elementary arithmetic operations, integrals, and maximization for these functions. In the second part, I briefly cover a wide range of novel applications where the XADD may be applied: (a) exact inference in expressive discrete and continuous variable graphical models, (b) factored, parameterized linear and quadratic optimization, (c) exact solutions to piecewise convex functions that enable a number of novel applications in machine learning, and (d) exact solutions to continuous state, action, and observation sequential decision-making problems — which includes closed-form exact solutions to previously unsolved problems in operations research. Acknowledgments: This is joint work with Zahra Zamani & Ehsan Abbasnejad (Australian National University), Karina Valdivia Delgado & Leliane Nunes de Barros (University of Sao Paulo), and Simon Fang (M.I.T.). Quick Speaker Bio: Scott Sanner is a Senior Researcher in the Machine Learning Group at NICTA Canberra and an Adjunct Fellow at the Australian National University, having joined both in 2007. Scott earned a PhD from the University of Toronto, an MS degree from Stanford, and a double BS degree from Carnegie Mellon. 
Scott’s research interests span decision-making applications ranging over AI, Machine Learning, and Information Retrieval. For more information, please visit: http://users.cecs.anu.edu.au/~ssanner/
December 3 Bren Hall 4011 1 pm	Francesco Bonchi Senior Research Scientist Yahoo! Research Barcelona Mining Propagation Data (in Social Networks) With the success of online social networks and microblogging platforms such as Facebook, Flickr and Twitter, the phenomenon of influence-driven propagations has recently attracted the interest of computer scientists, information technologists, and marketing specialists. In this talk we take a data mining perspective and we discuss what (and how) can be learned from a social network and a database of traces of past propagations over the social network. Starting from one of the key problems in this area, i.e., the identification of influential users, by targeting whom certain desirable marketing outcomes can be achieved, we provide an overview of some recent progress in this area and discuss some open problems.
December 10 Bren Hall 4011 1 pm	George Papandreou Postdoctoral Research Scholar Department of Statistics University of California, Los Angeles Random Sampling and Optimization in Probabilistic Modeling for Computer Vision Machine learning plays an increasingly important role in computer vision, allowing us to build complex vision systems that better capture the properties of images. Probabilistic Bayesian methods such as Markov random fields are well suited for describing ambiguous images and videos, providing us with the natural conceptual framework for representing the uncertainty in interpreting them and automatically learning model parameters from training data. However, Bayesian techniques pose significant computational challenges in computer vision applications and alternative deterministic energy minimization techniques are often preferred in practice. I will present a new computationally efficient probabilistic random field model, which can be best described as a “Perturb-and-MAP” generative process: We obtain a random sample from the whole field at once by first injecting noise into the system’s energy function, then solving an optimization problem to find the least energy configuration of the perturbed system. With Perturb-and-MAP random fields we thus turn powerful deterministic energy minimization methods into efficient probabilistic random sampling algorithms that bypass costly Markov-chain Monte-Carlo (MCMC) and can generate in a fraction of a second independent random samples from mega-pixel sized images. I will discuss how the Perturb-and-MAP model relates to the standard Gibbs MRF and how it can be used in conjunction with other approximate Bayesian computation techniques. I will illustrate these ideas with applications in image inpainting and deblurring, image segmentation, and scene labeling, showing how the Perturb-and-MAP model makes large-scale Bayesian inference computationally tractable for challenging computer vision problems. 
Speaker Bio: George Papandreou holds a Diploma (2003) and a Ph.D. (2009) in electrical and computer engineering from the National Technical University of Athens, Greece. Since 2009 he has been a postdoctoral research scholar at the University of California, Los Angeles. His research interests are in probabilistic machine learning, computer vision, and multimodal perception. He approaches these problems with methods from Bayesian statistics, signal processing, and applied mathematics.
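Illustrative note on the Perturb-and-MAP abstract above: the recipe of injecting noise into the energy function and then minimizing can be sketched on a toy problem small enough to enumerate. The sketch assumes independent Gumbel noise on every configuration, in which case the minimizer is an exact Gibbs sample (the Gumbel-max trick); this full enumeration is feasible only for tiny state spaces and is not a model of the large random fields discussed in the talk. All names are hypothetical.

```python
import math
import random


def gumbel(rng):
    """Draw one standard Gumbel variate by inverse-CDF sampling."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # guard against log(0)
    return -math.log(-math.log(u))


def perturb_and_map(energies, rng):
    """Perturb each configuration's energy, then return the minimum-energy index."""
    perturbed = [e - gumbel(rng) for e in energies]
    return min(range(len(energies)), key=perturbed.__getitem__)


def gibbs_probabilities(energies):
    """Reference Gibbs distribution p(i) proportional to exp(-E_i)."""
    weights = [math.exp(-e) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


def empirical_frequencies(energies, n_draws, seed=0):
    """Fraction of perturb-and-MAP draws that land on each configuration."""
    rng = random.Random(seed)
    counts = [0] * len(energies)
    for _ in range(n_draws):
        counts[perturb_and_map(energies, rng)] += 1
    return [c / n_draws for c in counts]
```

Comparing `empirical_frequencies(energies, n)` with `gibbs_probabilities(energies)` for a short list of energies shows the two agreeing up to sampling noise, with each draw costing one optimization rather than a Markov chain run.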