Lecturers
Each Lecturer will hold up to four lectures on one or more research topics.
Topics
Foundation Models, Transformers, Representation Learning, Reinforcement Learning
Biography
Together with Xiaohua Zhai and Alexander Kolesnikov, I co-founded the Zürich OpenAI office, which made some news.
Before that, I was a Staff Research Scientist at Google DeepMind (formerly Brain) in Zürich, where I co-led our multimodal research effort and codebase.
I have a growing list of publications at top-tier conferences such as CVPR, NeurIPS, ICCV, … See my Google Scholar or Semantic Scholar pages for the full list of over 50. Here are a few of my favourite publications that you may have heard of, with a one-sentence TL;DR:
https://scholar.google.com/citations?user=p2gwhK4AAAAJ&hl=fr
Lectures
Biography
Giuseppe Di Fatta has been a Full Professor at the Free University of Bozen-Bolzano (Italy) since 2022. From 2006 to 2021, he was with the University of Reading (UK), where he also served as Head of the Department of Computer Science from 2016 to 2021. Between 2004 and 2006, he was at the University of Konstanz (Germany), where he was part of the initial development team of KNIME, a widely used data science and machine learning platform. From 2000 to 2004, he worked with the High-Performance Computing and Networking Institute of the National Research Council of Italy, and in 1999 he was a research fellow at the International Computer Science Institute (ICSI) in Berkeley, California.
His research interests include artificial intelligence, machine learning algorithms, data science, and data-driven applications in both scientific and industrial domains. He has authored more than 140 peer-reviewed publications and has been a member of the IEEE since 2002 and a Fellow of the Higher Education Academy (UK) since 2009. He is also a member of the Technical Committee on Machine Learning (TC-ML) of the IEEE Systems, Man, and Cybernetics Society.
Lectures
Abstract TBA
Topics
Generative Models, Causality, Hypothesis Testing, Machine Learning
Biography
Arthur Gretton is a Professor with the Gatsby Computational Neuroscience Unit, CSML, UCL, which he joined in 2010. He received degrees in physics and systems engineering from the Australian National University, and a PhD with Microsoft Research and the Signal Processing and Communications Laboratory at the University of Cambridge. He worked from 2002 to 2012 at the MPI for Biological Cybernetics, and from 2009 to 2010 at the Machine Learning Department, Carnegie Mellon University. Arthur’s research interests include machine learning, kernel methods, statistical learning theory, nonparametric hypothesis testing, blind source separation, Gaussian processes, and nonparametric techniques for neural data analysis. He was an Associate Editor at IEEE Transactions on Pattern Analysis and Machine Intelligence from 2009 to 2013 and has been an Action Editor for JMLR since April 2013. He was a member of the NIPS Program Committee in 2008 and 2009, a Senior Area Chair for NIPS in 2018, an Area Chair for ICML in 2011 and 2012, and a member of the COLT Program Committee in 2013. Arthur was co-chair of AISTATS in 2016 (with Christian Robert), and co-tutorials chair of ICML in 2018 (with Ruslan Salakhutdinov).
Lectures
Lectures
Networks and ensembles of networks are able to capture interactions and dependencies among variables or observations, providing simple and powerful modeling of phenomena in different fields. Graph embedding involves the projection of graphs into a vector space while retaining their structural properties. We will review several of the embedding techniques developed in recent years.
Graph Neural Networks (GNNs) have been developed to learn low-dimensional representations of nodes, subgraphs, and graphs with complex node and edge features. These embeddings can then be used in several applications, ranging from feature extraction and graph clustering to classification models. In this lecture, we survey GNNs, with attention to their interpretability and explainability.
This talk introduces the use of Large Language Models with Graph Neural Networks (GNNs). We will see how modern architectures transform static data into autonomous reasoning graphs, capable of deterministic, traceable, multi-hop reasoning.
Topics
Machine Learning, Generative Models, Reinforcement Learning, Video Games
Biography
I am a Partner Research Manager at Microsoft Research Cambridge, where I co-lead the People-Centric AI research area. My work focuses on generative AI, interactive media, and game intelligence, combining advances in machine learning with human-computer interaction, design, and social science. With my team we aim to create AI systems that empower people through collaboration, creativity, and play – unlocking new forms of interaction and addressing complex real-world challenges. I am passionate about driving interdisciplinary research that shapes the future of AI experiences across productivity, entertainment, and beyond.
Previously, I led the Game Intelligence team, which focused on machine learning research for video games and now forms part of the broader People-Centric AI area.
I am proud to serve the academic research community in my current roles of Board Member (since 2022) and Secretary of the Board (since 2024) of the International Conference on Learning Representations (ICLR), and have previously served as Senior Program Chair (ICLR 2021) and General Chair (ICLR 2022).
As part of the Microsoft Research PhD Scholarship program, I have deeply enjoyed co-supervising, and successfully graduating, the following PhD students:
- David Lindner (ETH Zurich, Switzerland and Microsoft Joint Research Center) – co-supervision with Andreas Krause
- Rémy Portelas (Inria, Bordeaux, France) – co-supervision with Pierre-Yves Oudeyer
- Steindor Saemundsson (Imperial College London, UK) – co-supervision with Marc Deisenroth
- Laetitia Teodorescu (Inria, Bordeaux, France) – co-supervision with Pierre-Yves Oudeyer
- Luisa Zintgraf (University of Oxford, UK) – co-supervision with Shimon Whiteson
Before joining Microsoft Research, I completed my PhD in Computer Science as part of the former ILPS group at the University of Amsterdam. I worked with Maarten de Rijke and Shimon Whiteson on smart search engines that learn directly from their users. For a list of my publications before joining MSR, please see the ILPS (Information and Language Processing Systems) list of publications, MSR Academic, or dblp.
Lectures
This lecture introduces world models as learned representations of environment dynamics, conceptualized as simulators that capture the temporal evolution of states. We will distinguish between world models as internal components of agents (e.g., supporting planning and decision-making) and world simulators as standalone generative systems (e.g., for creative uses). We will trace the historical development of the field and provide a structured overview of the architectural landscape, establishing a foundation for subsequent lectures.
This lecture covers the full pipeline for training a world simulator, from visual tokenization to large-scale generation, using games as a testbed. It examines the practical ecosystem around inference, evaluation, and controllability, and contrasts the leading architectural paradigms and their trade-offs.
While recent world simulators can generate visually and temporally coherent sequences, they often lack a deeper, causally grounded understanding of the environments they model. This lecture critically examines the central open challenges in the field, including maintaining long-horizon consistency, capturing causal structure and physical plausibility, and achieving robust generalization beyond the training distribution, while outlining potential future applications that can be unlocked as the field progresses to address these challenges.
Topics
Deep Learning, Gradient Descent Optimization Methods, Mathematical Analysis of the Gradients in Deep Learning, Adam Algorithm, Scientific Machine Learning
Biography
Prof. Arnulf Jentzen is a presidential chair professor at the Chinese University of Hong Kong, Shenzhen (since 2021) and a full professor at the University of Münster (since 2019). He began his undergraduate studies in mathematics at Goethe University Frankfurt in Germany in 2004, received his diploma degree there in 2007, and completed his PhD in mathematics there in 2009. The core research topics of his research group are machine learning approximation algorithms, computational stochastics, numerical analysis for high-dimensional partial differential equations (PDEs), stochastic analysis, and computational finance. He currently serves on the editorial boards of several scientific journals such as the Annals of Applied Probability, Communications in Mathematical Sciences, the Journal of Machine Learning, the SIAM Journal on Scientific Computing, and the SIAM Journal on Numerical Analysis. In 2020 he received the Felix Klein Prize of the European Mathematical Society (EMS), and in 2022 he was awarded an ERC Consolidator Grant from the European Research Council (ERC) and the Joseph F. Traub Prize for Achievement in Information-Based Complexity. Further details on the activities of his research group can be found at the webpage http://www.ajentzen.de.
Lectures
In these lectures we present several selected basic results regarding the theoretical understanding of artificial intelligence (AI) methods and structures. Specifically, we first review popular stochastic optimization methods used for training AI models such as the standard stochastic gradient descent (SGD) method, the momentum method, the adaptive root mean square propagation (RMSprop) method, and the famous adaptive moment estimation (Adam) optimizer. In particular, we discuss the Adam symmetry theorem, the Adam vector field, and the Adam limit theorem, as well as convergence speeds and stability regions for different gradient-based optimization methods. Thereafter, we review the capabilities of deep neural networks (DNNs) to approximate certain high-dimensional functions such as solution functions of high-dimensional PDEs.
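For readers who have not seen the Adam optimizer named in the abstract above, the following is a minimal sketch of its standard update rule in plain Python. It is an illustration only and is not taken from the lecture material: the function names (`adam_step`, `minimize_quadratic`), the toy objective, and the hyperparameter values are assumptions chosen for this example.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a list of parameters.

    t is the 1-based step counter used for bias correction.
    Returns the updated parameters and moment estimates.
    """
    new_theta, new_m, new_v = [], [], []
    for p, g, mi, vi in zip(theta, grad, m, v):
        mi = beta1 * mi + (1.0 - beta1) * g        # running mean of gradients
        vi = beta2 * vi + (1.0 - beta2) * g * g    # running mean of squared gradients
        m_hat = mi / (1.0 - beta1 ** t)            # bias-corrected first moment
        v_hat = vi / (1.0 - beta2 ** t)            # bias-corrected second moment
        new_theta.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(mi)
        new_v.append(vi)
    return new_theta, new_m, new_v


def minimize_quadratic(steps=2000, lr=0.01):
    """Run Adam on the toy objective f(x, y) = (x - 3)^2 + (y + 1)^2 (an assumed example)."""
    theta, m, v = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    for t in range(1, steps + 1):
        grad = [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)]
        theta, m, v = adam_step(theta, grad, m, v, t, lr=lr)
    return theta


if __name__ == "__main__":
    print(minimize_quadratic())
```

Because of the bias correction, the very first step has magnitude close to `lr` in each coordinate regardless of the gradient scale. Setting `beta1 = 0` and dropping the bias correction gives an RMSprop-style update, and replacing the division by the square root with a constant recovers momentum SGD.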
Topics
Data Science, Global Optimization, Mathematical Modeling, Financial Applications, AI
Biography
Distinguished Emeritus Professor Panos Pardalos
University of Florida
Panos Pardalos was born in Drosato (Mezilo) Argitheas, Greece, in 1954 and graduated from Athens University (Department of Mathematics). He received his PhD in Computer and Information Sciences from the University of Minnesota. He is an Emeritus Distinguished Professor in the Department of Industrial and Systems Engineering at the University of Florida, and an affiliated faculty member in the Biomedical Engineering and Computer Science & Information Engineering departments. Since 2011, he has served as the academic advisor at LATNA, HSE.
Panos Pardalos is a world-renowned leader in Global Optimization, Mathematical Modeling, Energy Systems, Financial Applications, and Data Sciences. He is a Fellow of AAAS, AAIA, AIMBE, EUROPT, and INFORMS, and was awarded the 2013 Constantin Carathéodory Prize by the International Society of Global Optimization. In addition, he was awarded the 2013 EURO Gold Medal by the Association of European Operational Research Societies. This medal is the preeminent European award given to Operations Research (OR) professionals for “scientific contributions that stand the test of time.”
Professor Pardalos was also honored with the prestigious Humboldt Research Award (2018–2019). This award is granted in recognition of a researcher’s entire body of work—fundamental discoveries, new theories, and insights that have had a significant impact on their discipline.
Furthermore, he is a member of several Academies of Sciences and holds numerous honorary PhD degrees and affiliations. He is the Founding Editor of Optimization Letters and Energy Systems, and Co-Founder of the International Journal of Global Optimization, Computational Management Science, and Springer Nature Operations Research Forum. He has published over 600 journal papers and edited or authored over 200 books. He is one of the most highly cited authors in his field and has graduated 71 PhD students to date.
Further details can be found at: https://faculty.eng.ufl.edu/
Lectures
Panos Pardalos UF & LATNA
https://faculty.eng.ufl.edu/pardalos/publications/
This lecture examines the fundamental shift from isolated, monolithic systems to the expansive “Network of Networks” architecture that underpins modern global infrastructure. We move beyond traditional single-layer analysis to explore the intricate interdependencies among critical domains—for example, the Energy–Financial nexus, where real-time market signals influence grid stability, and the Transportation–Digital nexus, where autonomous logistics depend on ubiquitous communication.
Problems in networks of networks are far more complex than those in single networks. For example, in a single network, the propagation of failures can often be predicted and contained. In contrast, within a “Network of Networks,” such failures become exponentially more difficult to anticipate due to hidden interdependencies—connections that remain invisible until they trigger cascading and often unpredictable effects.
Topics
LLMs, Foundation Models, AI, NLP
Biography
Raniero Romagnoli is CTO of Almawave and CEO of OBDA Systems. He is an expert in Artificial Intelligence and Natural Language Processing in both the enterprise and academic worlds. He leads the company’s technology strategy by managing research and development teams. He actively participates in numerous national and international initiatives in the field of AI, collaborating with research centers and academic institutions. He teaches advanced courses in Data Science, Machine Learning, and AI, and is co-author of numerous scientific articles and international patents.
Lectures
Topics
Large Language Models, Reasoning, Foundation Models, Fine-tuning Large Language Models, Reinforcement Learning with Human Feedback, Test-Time Computation
Biography
Michal is the Founding Researcher at a stealth startup, a tenured researcher at Inria, and a lecturer at MVA at ENS Paris-Saclay. Michal is primarily interested in designing algorithms that require as little human supervision as possible. He works on methods and settings that are able to deal with minimal feedback, such as deep reinforcement learning, bandit algorithms, self-supervised learning, or self-play. Michal has recently worked on representation learning, world models, and deep (reinforcement) learning algorithms that have some theoretical underpinning. In the past he has also worked on sequential algorithms with structured decisions where exploiting the structure leads to provably faster learning. Michal is now working on a new generation of large language models (LLMs), in addition to providing algorithmic solutions for their scalable test-time inference, fine-tuning, and alignment. He received his PhD in 2011 from the University of Pittsburgh, before obtaining tenure at Inria in 2012 and co-creating Google DeepMind Paris with R. Munos. In 2024, he became a Principal Llama Scientist at Meta, building the online reinforcement learning stack and research for Llama 3.
Lectures
Abstract TBA
Topics
Multimodal Models, Vision Language Models
Biography
I’m a Research Scientist at Mistral AI, working on multi-modal language models. Previously, I completed a PhD in the VGG at Oxford University, working on representation learning in computer vision, where I was fortunate to be supervised by Andrew Zisserman and Andrea Vedaldi. During my PhD, I also spent time at Meta AI (FAIR): first with Ishan Misra in New York, and then in the Segment Anything team with Ross Girshick.
https://scholar.google.com/citations?user=lvuOknUAAAAJ&hl=en
Lectures
Topics
Artificial Intelligence, Machine Learning, Natural Language Processing, Vision
Biography
Jason is a Research Scientist at Facebook, NY and a Visiting Research Professor at NYU. He earned his Ph.D. in machine learning at Royal Holloway, University of London and at AT&T Research in Red Bank, NJ. Previously, he was a researcher at Biowulf Technologies, a research scientist at the Max Planck Institute for Biological Cybernetics, Tuebingen, Germany, a research staff member at NEC Labs America, Princeton, and a research scientist at Google, NY. His interests lie in statistical machine learning, with a focus on reasoning, memory, perception, interaction, and communication. Jason has published over 100 papers, including Best Paper awards at ICML and ECML, and received a Test of Time Award for his work “A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning” (with Ronan Collobert). He was part of the YouTube team that won a National Academy of Television Arts & Sciences Emmy Award for Technology and Engineering for Personalized Recommendation Engines for Video Discovery. Jason was also listed as the 16th most influential machine learning scholar by AMiner and one of the top 50 authors in Computer Science in Science.
Lectures
Abstract TBA
Abstract TBA
Abstract TBA
Tutorial Speakers
Lectures
Abstract TBA
Lectures
Abstract TBA
Lectures
Abstract TBA