I am a Postdoctoral Researcher at the Finnish Center for AI (FCAI), working with Luigi Acerbi (Machine and Human Intelligence Group) and Samuel Kaski (Probabilistic Machine Learning Group), along with other collaborators. My research focuses on probabilistic machine learning, with particular interests in amortized Bayesian inference, learning from synthetic data, and adaptive experimental design.
I did my PhD at the Bosch Center for AI & TU Berlin, working with Christoph Zimmer, Barbara Rakitsch, and Marc Toussaint. My PhD research focused on Bayesian active learning, particularly under constraints, using Gaussian processes, neural networks, and synthetic data.
Before this, I pursued a Master’s in Computational Neuroscience at the University of Tübingen, supported by the prestigious Tsung Cho Chang Foundation scholarship, awarded to only five students in Taiwan each year. My Master’s thesis, supervised by Martin Giese, developed an explainable neural network model for encoding biological motor control. I completed my undergraduate studies in mathematics at National Tsing Hua University and also spent a year developing database web interfaces while assisting in neurogenetics research.
Research interests
- Bayesian Inference, Bayesian Inference with Deep Learning
- Active Learning, Bayesian Optimization, Adaptive Experimental Design
- Amortized / Pretrained Bayesian Approaches
Selected publications
AISTATS 2026
Amortized Safe Active Learning for Real-Time Data Acquisition: Pretrained Neural Policies from Simulated Nonparametric Functions
Cen-You Li, Marc Toussaint, Barbara Rakitsch*, Christoph Zimmer*
* equal contribution.
International Conference on Artificial Intelligence and Statistics (AISTATS). 2026.
Safe active learning (AL) is a sequential scheme for learning unknown systems while respecting safety constraints during data acquisition. Existing methods often rely on Gaussian processes (GPs) to model the task and safety constraints, requiring repeated GP updates and constrained acquisition optimization, incurring significant computation that is challenging for real-time decision-making. We propose amortized AL for regression and amortized safe AL, replacing expensive online computations with a pretrained neural policy. Inspired by recent advances in amortized Bayesian experimental design, we leverage GPs as pretraining simulators. We train our policy prior to AL deployment on simulated nonparametric functions, using Fourier feature-based GP sampling and a differentiable acquisition objective that is safety-aware in the safe AL setting. At deployment, our policy selects informative and (if desired) safe queries via a single forward pass, eliminating GP inference and acquisition optimization. This yields orders-of-magnitude speedups while preserving learning quality. Our framework is modular and, without the safety component, yields fast unconstrained AL for time-sensitive tasks.
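The pretraining simulator above draws random nonparametric functions from a GP. As an illustrative sketch (not the paper's implementation; function names and hyperparameters here are assumptions), one way to sample approximate draws from an RBF-kernel GP is via random Fourier features, which gives a closed-form function that can be evaluated and differentiated anywhere:

```python
import numpy as np

def sample_rbf_function(n_features=200, lengthscale=0.5, variance=1.0, seed=0):
    """Approximate draw from a zero-mean GP with an RBF kernel using
    random Fourier features. The returned closed-form function can be
    evaluated at arbitrary inputs, which is what makes GP samples
    usable as cheap simulators for pretraining a policy."""
    rng = np.random.default_rng(seed)
    omega = rng.normal(0.0, 1.0 / lengthscale, size=n_features)   # spectral frequencies
    phase = rng.uniform(0.0, 2.0 * np.pi, size=n_features)        # random phases
    weights = rng.normal(0.0, 1.0, size=n_features)               # feature weights
    scale = np.sqrt(2.0 * variance / n_features)

    def f(x):
        x = np.atleast_1d(np.asarray(x, dtype=float))
        return scale * np.cos(np.outer(x, omega) + phase) @ weights

    return f

f = sample_rbf_function()
xs = np.linspace(-1.0, 1.0, 5)
print(f(xs))  # one smooth random function, evaluable at arbitrary inputs
```

Each call with a fresh seed yields a new synthetic training function, so a policy can be trained on an unbounded stream of simulated tasks before deployment.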
ICLR 2026
Efficient Autoregressive Inference for Transformer Probabilistic Models
Conor Hassan*, Nasrulloh Loka*, Cen-You Li, Daolang Huang, Paul E. Chang, Yang Yang, Francesco Silvestrin, Samuel Kaski, Luigi Acerbi
* equal contribution.
International Conference on Learning Representations (ICLR). 2026.
Set-based transformer models for amortized probabilistic inference and meta-learning, such as neural processes, prior-fitted networks, and tabular foundation models, excel at single-pass marginal prediction. However, many applications require joint distributions over multiple predictions. Purely autoregressive architectures generate these efficiently but sacrifice flexible set-conditioning. Obtaining joint distributions from set-based models requires re-encoding the entire context at each autoregressive step, which scales poorly. We introduce a causal autoregressive buffer that combines the strengths of both paradigms. The model encodes the context once and caches it; a lightweight causal buffer captures dependencies among generated targets, with each new prediction attending to both the cached context and all previously predicted targets added to the buffer. This enables efficient batched autoregressive sampling and joint predictive density evaluation. Training integrates set-based and autoregressive modes through masked attention at minimal overhead. Across synthetic functions, EEG time series, a Bayesian model comparison task, and tabular regression, our method closely matches the performance of full context re-encoding while delivering up to $20\times$ faster joint sampling and density evaluation, and up to $7\times$ lower memory usage.
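The encode-once-then-buffer idea can be illustrated with a deliberately tiny NumPy toy (this is not the paper's architecture; there are no learned projections, masks, or batching, and all shapes are invented for illustration): the context is encoded a single time, and each new target attends only to that cached encoding plus the targets generated so far.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension (toy choice)

def attend(q, kv):
    """Single-head dot-product attention of query rows q over key/value rows kv."""
    scores = q @ kv.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
    w /= w.sum(axis=-1, keepdims=True)
    return w @ kv

context = rng.normal(size=(100, d))   # embeddings of the context set
ctx_cache = attend(context, context)  # encode the context ONCE and cache it

buffer = np.empty((0, d))             # causal buffer of generated targets
for _ in range(5):
    q = rng.normal(size=(1, d))                   # embedding of the next target
    memory = np.vstack([ctx_cache, buffer])       # cached context + earlier targets
    prediction = attend(q, memory)
    buffer = np.vstack([buffer, prediction])      # new target sees all previous ones

print(buffer.shape)  # five targets generated without re-encoding the context
```

The contrast with full re-encoding is that the expensive context pass happens once, outside the generation loop, while each step only grows the small buffer.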
TMLR 2025
Global Safe Sequential Learning via Efficient Knowledge Transfer
Cen-You Li, Olaf Duennbier, Marc Toussaint, Barbara Rakitsch*, Christoph Zimmer*
* equal contribution.
Transactions on Machine Learning Research (TMLR). 2025.
Sequential learning methods, such as active learning and Bayesian optimization, aim to select the most informative data for task learning. In many applications, however, data selection is constrained by unknown safety conditions, motivating the development of safe learning approaches. A promising line of safe learning methods uses Gaussian processes to model safety conditions, restricting data selection to areas with high safety confidence. However, these methods are limited to local exploration around an initial seed dataset, as safety confidence centers around observed data points. As a consequence, task exploration is slowed down and safe regions disconnected from the initial seed dataset remain unexplored. In this paper, we propose safe transfer sequential learning to accelerate task learning and to expand the explorable safe region. By leveraging abundant offline data from a related source task, our approach guides exploration in the target task more effectively. We also provide a theoretical analysis to explain why single-task methods cannot cope with disconnected regions. Finally, we introduce a computationally efficient approximation of our method that reduces runtime through pre-computations. Our experiments demonstrate that this approach, compared to state-of-the-art methods, learns tasks with lower data consumption and enhances global exploration across multiple disjoint safe regions, while maintaining comparable computational efficiency.
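The disconnected-region effect can be seen in a small self-contained sketch (a 1-D single-output simplification, not the paper's method; the safety function, kernel, and thresholds are invented for illustration): with seed data in only one safe region, GP safety confidence never certifies a disjoint second region, but adding offline source observations does.

```python
import numpy as np
from scipy.stats import norm

def rbf(a, b, ls=0.6):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def p_safe(x_obs, z_obs, x_cand, thr=0.0, noise=1e-4):
    """Posterior probability, under a zero-mean RBF GP, that the latent
    safety value at each candidate exceeds thr."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    L = np.linalg.cholesky(K)
    Ks = rbf(x_obs, x_cand)
    a = np.linalg.solve(L.T, np.linalg.solve(L, z_obs))
    mean = Ks.T @ a
    v = np.linalg.solve(L, Ks)
    var = np.maximum(1.0 - (v ** 2).sum(axis=0), 1e-12)
    return 1.0 - norm.cdf(thr, mean, np.sqrt(var))

# Hypothetical safety function with two disjoint safe regions (near 0 and near 3).
safety = lambda x: np.maximum(1.0 - np.abs(x), 1.0 - np.abs(x - 3.0))
x_cand = np.linspace(-1.0, 4.0, 200)

x_seed = np.array([0.0, 0.2])                  # target-task seed data, one region only
seed_safe = p_safe(x_seed, safety(x_seed), x_cand) >= 0.95

x_all = np.concatenate([x_seed, np.linspace(2.6, 3.4, 8)])  # plus offline source data
transfer_safe = p_safe(x_all, safety(x_all), x_cand) >= 0.95

# Source data makes the disjoint region near x = 3 confidently safe as well.
print(seed_safe.sum(), transfer_safe.sum())
```

With seed data alone, the posterior reverts to the uninformative prior away from the seed cluster, so confidence never reaches the 95% bar in the second region; the transferred observations are what unlock it.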
AISTATS 2022
Safe Active Learning for Multi-Output Gaussian Processes
Cen-You Li, Barbara Rakitsch, Christoph Zimmer
International Conference on Artificial Intelligence and Statistics (AISTATS). 2022.
Multi-output regression problems are commonly encountered in science and engineering. In particular, multi-output Gaussian processes have emerged as a promising tool for modeling these complex systems, since they can exploit the inherent correlations between outputs and provide reliable uncertainty estimates. In many applications, however, acquiring the data is expensive and safety concerns might arise (e.g., in robotics or engineering). We propose a safe active learning approach for multi-output Gaussian process regression. This approach queries the most informative data point or output, taking the relatedness between the regressors and the safety constraints into account. We demonstrate the effectiveness of our approach through theoretical analysis and through empirical results on simulated datasets and on a real-world engineering dataset. On all datasets, our approach shows improved convergence compared to its competitors.
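The core safe-AL query step can be sketched in a single-output simplification (not the paper's multi-output method; the kernel, toy data, and thresholds are illustrative assumptions): pick the candidate with the highest task uncertainty among those whose safety value is positive with high posterior probability.

```python
import numpy as np
from scipy.stats import norm

def rbf(a, b, ls=0.5):
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

def gp_posterior(x_train, y_train, x_cand, noise=1e-4):
    """Exact GP posterior mean and variance for a 1-D RBF kernel."""
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    L = np.linalg.cholesky(K)
    Ks = rbf(x_train, x_cand)
    a = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ a
    v = np.linalg.solve(L, Ks)
    var = np.maximum(1.0 - (v ** 2).sum(axis=0), 1e-12)
    return mean, var

def safe_al_query(x_train, y_task, z_safety, x_cand, thr=0.0, conf=0.95):
    """Most uncertain task input among candidates whose safety value
    exceeds thr with posterior probability at least conf
    (assumes at least one candidate is confidently safe)."""
    _, task_var = gp_posterior(x_train, y_task, x_cand)
    s_mean, s_var = gp_posterior(x_train, z_safety, x_cand)
    feasible = 1.0 - norm.cdf(thr, s_mean, np.sqrt(s_var)) >= conf
    scores = np.where(feasible, task_var, -np.inf)
    return float(x_cand[np.argmax(scores)])

# Toy setup: safety value 1 - x, so larger inputs become unsafe.
x_train = np.array([0.0, 0.5, 1.0])
query = safe_al_query(x_train, np.sin(x_train), 1.0 - x_train,
                      np.linspace(-0.5, 2.0, 50))
```

Without the feasibility mask, the pure variance criterion would chase the far, unexplored (and unsafe) end of the candidate range; the mask is what trades exploration against the safety constraint.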