Weekly Seminar in AI & Machine Learning
Sponsored by the HPI Research Center in Machine Learning and Data Science at UC Irvine
Oct. 7 DBH 4011 1 pm |
We envision a world where AI agents (assistants) are widely used for complex tasks in our digital and physical worlds and are broadly integrated into our society. To move towards such a future, we need an environment for a robust evaluation of agents’ capability, reliability, and trustworthiness. In this talk, I’ll introduce AppWorld, which is a step towards this goal in the context of day-to-day digital tasks. AppWorld is a high-fidelity simulated world of people and their digital activities on nine apps like Amazon, Gmail, and Venmo. On top of this fully controllable world, we build a benchmark of complex day-to-day tasks such as splitting Venmo bills with roommates, which agents have to solve via interactive coding and API calls. One of the fundamental challenges with complex tasks lies in accounting for the different ways in which the tasks can be completed. I will describe how we address this challenge using a reliable and programmatic evaluation framework. Our benchmarking evaluations show that even the best LLMs, like GPT-4o, can only solve ~30% of such tasks, highlighting the challenging nature of the AppWorld benchmark. I will conclude by laying out exciting future research that can be conducted on the foundation of AppWorld, such as benchmarks and playgrounds for developing multimodal, collaborative, safe, socially intelligent, resourceful, and fail-tolerant agents.
Bio: Harsh Trivedi is a final-year PhD student at Stony Brook University, advised by Niranjan Balasubramanian. He is broadly interested in the development of reliable, explainable AI systems and their rigorous evaluation. Specifically, his research spans the domains of AI agents, multi-step reasoning, AI safety, and efficient NLP. He has interned at AI2 and was a visiting researcher at NYU. His recent work, AppWorld, received a Best Resource Paper award at ACL’24, and his work on AI safety via debate received a Best Paper award at the ML Safety workshop at NeurIPS’22. |
Oct. 14 DBH 4011 1 pm |
Diffusion models exhibit excellent sample quality across multiple generation tasks. However, their inference process is iterative and often requires hundreds of function evaluations. Moreover, it is unclear whether existing methods for accelerating diffusion model sampling generalize well across different types of diffusion processes. In the first part of my talk, I will introduce Conjugate Integrators, which project unconditional diffusion dynamics to an alternate space that is more amenable to faster sampling. The resulting framework possesses several interesting theoretical connections with prior work in fast diffusion sampling, enabling their application to a broader class of diffusion processes. In the second part of my talk, I will extend the idea of Conjugate Integrators from unconditional sampling to conditional diffusion sampling in the context of solving inverse problems. Empirically, on challenging inverse problems like 4x super-resolution on the ImageNet-256 dataset, conditional Conjugate Integrators can generate high-quality samples in as few as 5 conditional sampling steps, providing significant speedups over prior work.
Bio: Kushagra is a third-year PhD student in Computer Science at UCI, advised by Prof. Stephan Mandt. Previously, he completed his bachelor’s and master’s degrees in Computer Science at the Indian Institute of Technology. He is broadly interested in the efficient design of, and inference in, deep generative models, with a current focus on iterative refinement models like Diffusion Models and Stochastic Interpolants. |
Oct. 21 DBH 4011 1 pm |
Mobile health (mHealth) interventions, such as text messages and push notifications targeting behavior change, are a promising alternative to in-person healthcare. Understanding how the effect of an mHealth intervention varies over time and with contextual information is critical for optimizing the intervention and advancing domain knowledge. We discuss two projects that showcase the use of causal inference and machine learning in answering such questions. In the first project, we assess how a push notification suggesting physical activity influences individuals’ step counts using data from a micro-randomized trial. We propose the first semiparametric causal excursion effect model with varying coefficients to model the time-varying effects within a decision point and across decision points. Our analysis reveals new insights into individuals’ change in response profiles due to the activity suggestions. In the second project, we study the theoretical limit of efficient estimation (i.e., semiparametric efficiency) for the causal effects of mHealth interventions. We propose a class of two-stage estimators that achieve the efficiency bound. Through real data applications and numerical experiments, we show how supervised learning and cross-fitting lead to substantial variance reduction and robustness against misspecified working models.
Bio: Tianchen Qian is an Assistant Professor in Statistics at UC Irvine. His research focuses on leveraging data science, mobile technology, and wearable devices to design robust, personalized, and cost-effective interventions that can impact health and well-being at a significant scale. He also works on causal inference, experimental design, machine learning, semiparametric efficiency theory, and longitudinal data methods. He has a PhD in Biostatistics from Johns Hopkins University. Before joining UCI, he was a postdoctoral fellow in Statistics at Harvard University. |
Oct. 28 DBH 4011 1 pm |
Jana Lipkova, Assistant Professor, Department of Pathology, School of Medicine, University of California, Irvine
To Be Announced. |
Nov. 4 DBH 4011 1 pm |
“Future Health” emphasizes the importance of recognizing each individual’s uniqueness, which arises from their specific omics, lifestyle, environmental, and socioeconomic conditions. Thanks to advancements in sensors, mobile computing, ubiquitous computing, and artificial intelligence (AI), we can now collect detailed information about individuals. This data serves as the foundation for creating personal models, offering predictive and preventive advice tailored specifically to each person. These models enable us to provide precise recommendations that closely align with the individual’s predicted needs. In my presentation, I will explore how AI, including generative AI, and wearable technology are revolutionizing the collection and analysis of big health data in everyday environments. I will discuss the analytics used to evaluate physical and mental health and how smart recommendations can be made objectively. Moreover, I will illustrate how conversational health agents (CHAs) powered by Large Language Models (LLMs) can integrate personal data, models, and knowledge into healthcare chatbots. This integration allows for creating personalized chatbots, enhancing the delivery of health guidance directly tailored to the individual. Additionally, I will present our open-source initiative on developing OpenCHA (openCHA.com).
Bio: Amir M. Rahmani is the founder of the Health SciTech Group at the University of California, Irvine (UCI) and the co-founder and co-director of the Institute for Future Health, a campus-wide Organized Research Unit at UCI. He is also a lifetime docent (Adjunct Professor) at the University of Turku (UTU), Finland. His research includes AI in healthcare, ubiquitous computing, AI-powered bio-signal processing, health informatics, and big health data analytics. He has been leading several NSF, NIH, Academy of Finland, and European Commission-funded projects on Smart Pain Assessment, Community-Centered Care, Family-centered Maternity Care, Stress Management in Adolescents, and Remote Elderly and Family Caregivers Monitoring. He is the co-author of more than 350 peer-reviewed publications, the associate editor-in-chief of the journals ACM Transactions on Computing for Healthcare and Frontiers in Wearable Electronics, and a member of the Editorial Board of Nature Scientific Reports. He is a distinguished member of the ACM and a senior member of the IEEE. |
Nov. 11 |
No Seminar (Veterans Day Holiday) |
Nov. 18 DBH 4011 1 pm |
To Be Announced. |
Nov. 25 DBH 4011 1 pm |
To Be Announced. |
Dec. 2 DBH 4011 1 pm |
Language model agents are tackling challenging tasks from embodied planning to web navigation to programming. These models are a powerful artifact of natural language processing research that is being applied to interactive environments traditionally reserved for reinforcement learning. However, many environments are not natively expressed in language, resulting in poor alignment between language representations and true states and actions. Additionally, while language models are generally capable, their biases from pretraining can be misaligned with specific environment dynamics. In this talk, I will cover our research into rectifying these issues through methods such as: (1) mapping high-level language model plans to low-level actions, (2) optimizing language model agent inputs using reinforcement learning, and (3) in-context policy improvement for continual task adaptation. |