I am a Postdoctoral Researcher at the Finnish Center for AI (FCAI), working with Luigi Acerbi (Machine and Human Intelligence Group) and Samuel Kaski (Probabilistic Machine Learning Group), along with other collaborators. My research focuses on probabilistic machine learning, with particular interests in amortized Bayesian inference, learning from synthetic data, and adaptive experimental design.
I did my PhD at the Bosch Center for AI & TU Berlin, working with Christoph Zimmer, Barbara Rakitsch, and Marc Toussaint. My PhD research focused on Bayesian active learning, particularly under constraints, using Gaussian processes, neural networks, and synthetic data.
Before this, I pursued a Master’s in Computational Neuroscience at the University of Tübingen, supported by the prestigious Tsung Cho Chang Foundation scholarship, awarded to only five students in Taiwan each year. My Master’s thesis, supervised by Martin Giese, developed an explainable neural network model for encoding biological motor control. I completed my undergraduate studies in mathematics at National Tsing Hua University and also spent a year developing database web interfaces while assisting in neurogenetics research.
Research interests
- Bayesian Inference, Bayesian Inference with Deep Learning
- Active Learning, Bayesian Optimization, Adaptive Experimental Design
- Amortized / Pretrained Bayesian Approaches
Selected publications
preprint
Amortized Safe Active Learning for Real-Time Data Acquisition: Pretrained Neural Policies from Simulated Nonparametric Functions
Cen-You Li, Marc Toussaint, Barbara Rakitsch*, Christoph Zimmer*
* equal contribution.
preprint. 2025.
ABS ARXIV
Safe active learning (AL) is a sequential scheme for learning unknown systems while respecting safety constraints during data acquisition. Existing methods often rely on Gaussian processes (GPs) to model the task and safety constraints, requiring repeated GP updates and constrained acquisition optimization—incurring significant computation that is challenging for real-time decision-making. We propose an amortized safe AL framework that replaces expensive online computations with a pretrained neural policy. Inspired by recent advances in amortized Bayesian experimental design, we turn GPs into a pretraining simulator. We train our policy prior to AL deployment on simulated nonparametric functions, using Fourier feature-based GP sampling and a differentiable, safety-aware acquisition objective. At deployment, our policy selects safe and informative queries via a single forward pass, eliminating the need for GP inference or constrained optimization. This leads to substantial speed improvements while preserving safety and learning quality. Our framework is modular and can be adapted to unconstrained, time-sensitive AL tasks by omitting the safety requirement.
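The pretraining simulator above draws random smooth functions from a GP prior. A minimal sketch of Fourier feature-based GP sampling (assuming a 1-D RBF kernel; function and parameter names are illustrative, not taken from the paper's code):

```python
import numpy as np

def sample_gp_function(n_features=200, lengthscale=0.5, variance=1.0, seed=None):
    """Draw one approximate sample from a 1-D RBF-kernel GP prior
    using random Fourier features (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Spectral frequencies of the RBF kernel: omega ~ N(0, 1/lengthscale^2)
    omega = rng.normal(0.0, 1.0 / lengthscale, size=n_features)
    phase = rng.uniform(0.0, 2 * np.pi, size=n_features)
    weights = rng.normal(0.0, 1.0, size=n_features)

    def f(x):
        # Random cosine features; the sampled function is their weighted sum.
        features = np.sqrt(2.0 * variance / n_features) * np.cos(
            np.outer(x, omega) + phase
        )
        return features @ weights

    return f

f = sample_gp_function(seed=0)
x = np.linspace(0.0, 1.0, 5)
y = f(x)  # one draw of a smooth nonparametric function, evaluated at 5 points
```

Each call with a fresh seed yields a new simulated function, so a policy can be trained on an endless stream of synthetic tasks without any real data.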
TMLR 2025
Global Safe Sequential Learning via Efficient Knowledge Transfer
Cen-You Li, Olaf Duennbier, Marc Toussaint, Barbara Rakitsch*, Christoph Zimmer*
* equal contribution.
Transactions on Machine Learning Research (TMLR). 2025.
ABS HTML ARXIV CODE
Sequential learning methods, such as active learning and Bayesian optimization, aim to select the most informative data for task learning. In many applications, however, data selection is constrained by unknown safety conditions, motivating the development of safe learning approaches. A promising line of safe learning methods uses Gaussian processes to model safety conditions, restricting data selection to areas with high safety confidence. However, these methods are limited to local exploration around an initial seed dataset, as safety confidence centers around observed data points. As a consequence, task exploration is slowed down and safe regions disconnected from the initial seed dataset remain unexplored. In this paper, we propose safe transfer sequential learning to accelerate task learning and to expand the explorable safe region. By leveraging abundant offline data from a related source task, our approach guides exploration in the target task more effectively. We also provide a theoretical analysis to explain why single-task methods cannot cope with disconnected regions. Finally, we introduce a computationally efficient approximation of our method that reduces runtime through pre-computations. Our experiments demonstrate that this approach, compared to state-of-the-art methods, learns tasks with lower data consumption and enhances global exploration across multiple disjoint safe regions, while maintaining comparable computational efficiency.
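The safety-confidence gating described above is commonly implemented as a lower confidence bound on the GP posterior over the safety function. A minimal sketch (the helper name and parameters are illustrative, not from the paper's code):

```python
import numpy as np

def safe_candidates(mu, sigma, safety_threshold=0.0, beta=2.0):
    """Keep candidate points whose GP lower confidence bound on the
    safety value exceeds the threshold (illustrative sketch)."""
    lcb = mu - beta * sigma  # pessimistic estimate of the safety function
    return np.flatnonzero(lcb >= safety_threshold)

# Toy posterior over 4 candidates: far from observed data sigma grows,
# confidence shrinks, and exploration stays local to the seed dataset --
# the limitation the transfer-learning approach is designed to overcome.
mu = np.array([1.0, 0.8, 0.5, 0.4])
sigma = np.array([0.1, 0.2, 0.6, 1.0])
idx = safe_candidates(mu, sigma)  # only the low-uncertainty candidates pass
```

Transferring offline source-task data effectively shrinks `sigma` in regions never visited in the target task, which is what allows disconnected safe regions to become reachable.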
AISTATS 2022
Safe Active Learning for Multi-Output Gaussian Processes
Cen-You Li, Barbara Rakitsch, Christoph Zimmer
International Conference on Artificial Intelligence and Statistics (AISTATS). 2022.
ABS HTML ARXIV CODE
Multi-output regression problems are commonly encountered in science and engineering. In particular, multi-output Gaussian processes have emerged as a promising tool for modeling these complex systems since they can exploit the inherent correlations and provide reliable uncertainty estimates. In many applications, however, acquiring the data is expensive and safety concerns might arise (e.g. robotics, engineering). We propose a safe active learning approach for multi-output Gaussian process regression. This approach queries the most informative data or output while taking into account the relatedness between the regressors and the safety constraints. We demonstrate the effectiveness of our approach by providing theoretical analysis and by presenting empirical results on simulated datasets and on a real-world engineering dataset. On all datasets, our approach shows improved convergence compared to its competitors.
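One standard way multi-output GPs exploit correlations between outputs is the intrinsic coregionalization model (ICM), where a coregionalization matrix scales a shared input kernel. A minimal sketch, assuming an RBF input kernel (names and parameters are illustrative, not the paper's implementation):

```python
import numpy as np

def icm_kernel(X1, X2, out1, out2, B, lengthscale=1.0):
    """ICM covariance: K((x,i), (x',j)) = B[i,j] * k_rbf(x, x').
    B encodes correlations between outputs (illustrative sketch)."""
    sq = (X1[:, None] - X2[None, :]) ** 2
    k_input = np.exp(-0.5 * sq / lengthscale**2)  # shared RBF input kernel
    return B[np.ix_(out1, out2)] * k_input        # scale by output correlations

B = np.array([[1.0, 0.9],
              [0.9, 1.0]])                # two strongly correlated outputs
X = np.array([0.0, 0.0, 1.0])            # input locations
outs = np.array([0, 1, 0])               # which output each row belongs to
K = icm_kernel(X, X, outs, outs, B)      # 3x3 joint covariance matrix
```

With a joint covariance like this, an observation of one output reduces posterior uncertainty about the other, which is what lets the active learner query "data or output" jointly.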