
Reinforcement Learning

Research on reinforcement learning and human feedback

📊 50 Papers 📅 Updated: 2026-04-01
1
Aligned, Orthogonal or In-conflict: When can we safely optimize Chain-of-Thought?
Max Kaufmann, David Lindner, Roland S. Zimmermann et al. (4 authors)
📅 2026-03-31
Chain-of-Thought (CoT) monitoring, in which automated systems monitor the CoT of an LLM, is a promising approach for effectively overseeing AI systems. However, the extent to which a model's CoT helps us oversee the model - the monitorability of the CoT - can be affected by training, for instance by the model learning to hide important features of its reasoning. We propose and empirically...
2
Reward-Based Online LLM Routing via NeuralUCB
Ming-Hua Tsai, Phat Tran
📅 2026-03-31
This study investigates the use of NeuralUCB for cost-aware large language model (LLM) routing. Existing routing approaches can be broadly grouped into supervised routing methods and partial-feedback methods, each with different tradeoffs in efficiency and adaptivity. We implement a NeuralUCB-based routing policy and evaluate it on RouterBench under a simulated online setting. Experimental...
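The bandit view of routing described above can be sketched with classic UCB1 over a set of model "arms" (a simplified stand-in for NeuralUCB, which instead learns the reward with a neural network; the model names, qualities, and costs below are invented for illustration):

```python
import math
import random

class UCB1Router:
    """Minimal UCB1 bandit over LLM 'arms'.

    A classical-bandit sketch of cost-aware routing; NeuralUCB replaces
    these sample means with a neural reward model plus a gradient-based
    confidence bonus.
    """

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {a: 0 for a in self.arms}
        self.means = {a: 0.0 for a in self.arms}
        self.t = 0

    def select(self):
        self.t += 1
        # Play each arm once before applying the confidence bound.
        for a in self.arms:
            if self.counts[a] == 0:
                return a
        # UCB1: empirical mean reward plus exploration bonus.
        return max(self.arms, key=lambda a: self.means[a]
                   + math.sqrt(2 * math.log(self.t) / self.counts[a]))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

# Toy simulation: reward = answer quality minus a per-call cost.
random.seed(0)
quality = {"small-llm": 0.6, "large-llm": 0.9}
cost = {"small-llm": 0.1, "large-llm": 0.5}
router = UCB1Router(quality)
for _ in range(2000):
    arm = router.select()
    router.update(arm, random.gauss(quality[arm] - cost[arm], 0.05))
print(max(router.means, key=router.means.get))  # cost-adjusted best arm
```

Here the cheaper model wins once cost is subtracted from quality, which is exactly the tradeoff a cost-aware router must discover online.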
3
Tucker Attention: A generalization of approximate attention mechanisms
Timon Klein, Jonas Kusch, Sebastian Sager et al. (5 authors)
📅 2026-03-31
The pursuit of reducing the memory footprint of multi-head self-attention (MHA) has spawned a rich portfolio of methods, e.g., group-query attention (GQA) and multi-head latent attention (MLA). These methods leverage specialized low-rank factorizations across embedding dimensions or attention heads. From the point of view of classical low-rank approximation, these...
4
Refined Detection for Gumbel Watermarking
Tor Lattimore
📅 2026-03-31
We propose a simple detection mechanism for the Gumbel watermarking scheme proposed by Aaronson (2022). The new mechanism is proven to be near-optimal in a problem-dependent sense among all model-agnostic watermarking schemes under the assumption that the next-token distribution is sampled i.i.d.
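Aaronson's Gumbel scheme samples the next token as argmax_i u_i^(1/p_i) over key-derived uniforms u_i, and detection scores how close the chosen u_i sit to 1. A self-contained sketch (the hash-based PRF, tiny vocabulary, and raw-score comparison below are simplifications, not the refined detector the abstract proposes):

```python
import hashlib
import math
import random

def keyed_uniforms(key, context, vocab_size):
    """Derive one pseudo-random uniform per token from a secret key and
    the preceding context (a simplified hash-based stand-in for a PRF)."""
    return [int.from_bytes(
                hashlib.sha256(f"{key}|{context}|{i}".encode()).digest()[:8],
                "big") / 2**64
            for i in range(vocab_size)]

def gumbel_sample(probs, us):
    # argmax_i u_i^(1/p_i) samples exactly from probs, yet is
    # deterministic given the key and context.
    return max(range(len(probs)), key=lambda i: us[i] ** (1.0 / probs[i]))

def detect_score(key, contexts, tokens, vocab_size):
    # Without a watermark each -log(1 - u) is Exp(1); watermarked text
    # pushes the chosen u_i toward 1, inflating the total score.
    return sum(-math.log(1.0 - keyed_uniforms(key, c, vocab_size)[t])
               for c, t in zip(contexts, tokens))

random.seed(1)
vocab, key, n = 20, "secret", 200
probs = [1.0 / vocab] * vocab
contexts = [f"ctx{j}" for j in range(n)]
marked = [gumbel_sample(probs, keyed_uniforms(key, c, vocab)) for c in contexts]
unmarked = [random.randrange(vocab) for _ in range(n)]
print(detect_score(key, contexts, marked, vocab) >
      detect_score(key, contexts, unmarked, vocab))  # True
```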
5
Tracking Equivalent Mechanistic Interpretations Across Neural Networks
Alan Sun, Mariya Toneva
📅 2026-03-31
Mechanistic interpretability (MI) is an emerging framework for interpreting neural networks. Given a task and model, MI aims to discover a succinct algorithmic process, an interpretation, that explains the model's decision process on that task. However, MI is difficult to scale and generalize. This stems in part from two key challenges: there is no precise notion of a valid interpretation;...
6
Aligning Validation with Deployment: Target-Weighted Cross-Validation for Spatial Prediction
Alexander Brenning, Thomas Suesse
📅 2026-03-31
Cross-validation (CV) is commonly used to estimate predictive risk when independent test data are unavailable. Its validity depends on the assumption that validation tasks are sampled from the same distribution as prediction tasks encountered during deployment. In spatial prediction and other settings with structured data, this assumption is frequently violated, leading to biased estimates of...
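A generic way to align validation with deployment is to importance-weight each held-out error toward the target distribution. The sketch below uses an assumed density-ratio weight and a deliberately crude model; it illustrates the weighting idea only, not the paper's spatial construction:

```python
import random

def weighted_cv_risk(xs, ys, fit, loss, weight, k=5):
    """K-fold CV where each held-out error is importance-weighted
    toward a deployment (target) distribution."""
    idx = list(range(len(xs)))
    random.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    num = den = 0.0
    for f in folds:
        train = [i for i in idx if i not in f]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in f:
            w = weight(xs[i])  # assumed density ratio p_target / p_val
            num += w * loss(model(xs[i]), ys[i])
            den += w
    return num / den

random.seed(0)
xs = [random.uniform(0, 1) for _ in range(200)]
ys = [2 * x + random.gauss(0, 0.1) for x in xs]
mean_fit = lambda X, Y: (lambda x: sum(Y) / len(Y))  # crude constant model
sq = lambda p, y: (p - y) ** 2
# Deployment concentrates on large x, where the constant model errs most,
# so the target-weighted risk estimate exceeds the uniform one.
uniform_risk = weighted_cv_risk(xs, ys, mean_fit, sq, lambda x: 1.0)
target_risk = weighted_cv_risk(xs, ys, mean_fit, sq, lambda x: x ** 4)
print(target_risk > uniform_risk)  # True
```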
7
Quantifying Cross-Modal Interactions in Multimodal Glioma Survival Prediction via InterSHAP: Evidence for Additive Signal Integration
Iain Swift, JingHua Ye, Ruairi O'Reilly
📅 2026-03-31
Multimodal deep learning for cancer prognosis is commonly assumed to benefit from synergistic cross-modal interactions, yet this assumption has not been directly tested in survival prediction settings. This work adapts InterSHAP, a Shapley interaction index-based metric, from classification to Cox proportional hazards models and applies it to quantify cross-modal interactions in glioma survival...
8
Meteorology-Driven GPT4AP: A Multi-Task Forecasting LLM for Atmospheric Air Pollution in Data-Scarce Settings
Prasanjit Dey, Soumyabrata Dev, Bianca Schoen-Phelan
📅 2026-03-31
Accurate forecasting of air pollution is important for environmental monitoring and policy support, yet data-driven models often suffer from limited generalization in regions with sparse observations. This paper presents Meteorology-Driven GPT for Air Pollution (GPT4AP), a parameter-efficient multi-task forecasting framework based on a pre-trained GPT-2 backbone and Gaussian rank-stabilized...
9
Do covariates explain why these groups differ? The choice of reference group can reverse conclusions in the Oaxaca-Blinder decomposition
Manuel Quintero, Advik Shreekumar, William T. Stephenson et al. (4 authors)
📅 2026-03-31
Scientists often want to explain why an outcome is different in two groups. For instance, differences in patient mortality rates across two hospitals could be due to differences in the patients themselves (covariates) or differences in medical care (outcomes given covariates). The Oaxaca-Blinder decomposition (OBD) is a standard tool to tease apart these factors. It is well known that the OBD...
10
Think Anywhere in Code Generation
Xue Jiang, Tianyu Zhang, Ge Li et al. (11 authors)
📅 2026-03-31
Recent advances in reasoning Large Language Models (LLMs) have primarily relied on upfront thinking, where reasoning occurs before the final answer. However, this approach suffers from critical limitations in code generation, where upfront thinking is often insufficient because a problem's full complexity only reveals itself during code implementation. Moreover, it cannot adaptively allocate reasoning...
11
Real-Time Explanations for Tabular Foundation Models
Luan Borges Teodoro Reis Sena, Francisco Galuppo Azevedo
📅 2026-03-31
Interpretability is central for scientific machine learning, as understanding why models make predictions enables hypothesis generation and validation. While tabular foundation models show strong performance, existing explanation methods like SHAP are computationally expensive, limiting interactive exploration. We introduce ShapPFN, a foundation model that integrates Shapley value...
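Why exact explanation is expensive is easy to see from the Shapley computation itself, which enumerates every feature coalition. A generic illustration (the toy payoff game is invented and unrelated to ShapPFN's internals):

```python
from itertools import combinations
from math import factorial

def exact_shapley(value_fn, n_features):
    """Exact Shapley values by enumerating all 2^n coalitions.

    Exponential in the number of features, which motivates amortized
    explainers like the one the abstract describes.
    """
    players = list(range(n_features))
    phi = [0.0] * n_features
    for i in players:
        others = [j for j in players if j != i]
        for k in range(len(others) + 1):
            for S in combinations(others, k):
                # Classic Shapley weight |S|! (n - |S| - 1)! / n!
                w = (factorial(len(S)) * factorial(n_features - len(S) - 1)
                     / factorial(n_features))
                phi[i] += w * (value_fn(set(S) | {i}) - value_fn(set(S)))
    return phi

# Toy additive game: a coalition's value is the sum of its payoffs,
# so each feature's Shapley value equals its own payoff.
payoff = {0: 1.0, 1: 2.0, 2: -0.5}
v = lambda S: sum(payoff[j] for j in S)
print([round(x, 6) for x in exact_shapley(v, 3)])  # [1.0, 2.0, -0.5]
```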
12
Better than Average: Spatially-Aware Aggregation of Segmentation Uncertainty Improves Downstream Performance
Vanessa Emanuela Guarino, Claudia Winklmayr, Jannik Franzen et al. (10 authors)
📅 2026-03-31
Uncertainty Quantification (UQ) is crucial for ensuring the reliability of automated image segmentations in safety-critical domains like biomedical image analysis or autonomous driving. In segmentation, UQ generates pixel-wise uncertainty scores that must be aggregated into image-level scores for downstream tasks like Out-of-Distribution (OoD) or failure detection. Despite routine use of...
13
End-to-End Image Compression with Segmentation Guided Dual Coding for Wind Turbines
Raül Pérez-Gonzalo, Andreas Espersen, Søren Forchhammer et al. (4 authors)
📅 2026-03-31
Transferring large volumes of high-resolution images during wind turbine inspections introduces a bottleneck in assessing and detecting severe defects. Efficient coding must preserve high fidelity in blade regions while aggressively compressing the background. In this work, we propose an end-to-end deep learning framework that jointly performs segmentation and dual-mode (lossy and lossless)...
14
Uncertainty Gating for Cost-Aware Explainable Artificial Intelligence
Georgii Mikriukov, Grégoire Montavon, Marina M. -C. Höhne
📅 2026-03-31
Post-hoc explanation methods are widely used to interpret black-box predictions, but their generation is often computationally expensive and their reliability is not guaranteed. We propose epistemic uncertainty as a low-cost proxy for explanation reliability: high epistemic uncertainty identifies regions where the decision boundary is poorly defined and where explanations become unstable and...
15
Task Scarcity and Label Leakage in Relational Transfer Learning
Francisco Galuppo Azevedo, Clarissa Lima Loures, Denis Oliveira Correa
📅 2026-03-31
Training relational foundation models requires learning representations that transfer across tasks, yet available supervision is typically limited to a small number of prediction targets per database. This task scarcity causes learned representations to encode task-specific shortcuts that degrade transfer even within the same schema, a problem we call label leakage. We study this using K-Space, a...
16
$p$-adic Character Neural Network
Tomoki Mihara
📅 2026-03-31
We propose a new framework for $p$-adic neural networks. Unlike the original $p$-adic neural network by S. Albeverio, A. Khrennikov, and B. Tirrozi, which uses a family of characteristic functions indexed by hyperparameters of precision as activation functions, we use a single injective $p$-adic character on the topological Abelian group $\mathbb{Z}_p$ of $p$-adic integers as an activation function....
17
Penalized GMM Framework for Inference on Functionals of Nonparametric Instrumental Variable Estimators
Edvard Bakhitov
📅 2026-03-31
This paper develops a penalized GMM (PGMM) framework for automatic debiased inference on functionals of nonparametric instrumental variable estimators. We derive convergence rates for the PGMM estimator and provide conditions for root-n consistency and asymptotic normality of debiased functional estimates, covering both linear and nonlinear functionals. Monte Carlo experiments on average...
18
DIAL: Decoupling Intent and Action via Latent World Modeling for End-to-End VLA
Yi Chen, Yuying Ge, Hui Zhou et al. (6 authors)
📅 2026-03-31
The development of Vision-Language-Action (VLA) models has been significantly accelerated by pre-trained Vision-Language Models (VLMs). However, most existing end-to-end VLAs treat the VLM primarily as a multimodal encoder, directly mapping vision-language features to low-level actions. This paradigm underutilizes the VLM's potential in high-level decision making and introduces training...
19
Toward Generalizable Whole Brain Representations with High-Resolution Light-Sheet Data
Minyoung E. Kim, Dae Hee Yun, Aditi V. Patel et al. (7 authors)
📅 2026-03-31
Unprecedented visual details of biological structures are being revealed by subcellular-resolution whole-brain 3D microscopy data, enabled by recent advances in intact tissue processing and light-sheet fluorescence microscopy (LSFM). These volumetric data offer rich morphological and spatial cellular information; however, the lack of scalable data processing and analysis methods tailored to these...
20
DiSGMM: A Method for Time-varying Microscopic Weight Completion on Road Networks
Yan Lin, Jilin Hu, Shengnan Guo et al. (6 authors)
📅 2026-03-31
Microscopic road-network weights represent fine-grained, time-varying traffic conditions obtained from individual vehicles. An example is travel speeds associated with road segments as vehicles traverse them. These weights support tasks including traffic microsimulation and vehicle routing with reliability guarantees. We study the problem of time-varying microscopic weight completion. During a...
21
Curvature-Guided LoRA: Steering in the pretrained NTK subspace
Frédéric Zheng, Alexandre Proutière
📅 2026-03-31
Parameter-efficient fine-tuning methods such as LoRA enable efficient adaptation of large pretrained models but often fall short of full fine-tuning performance. Existing approaches focus on aligning parameter updates, which only indirectly control model predictions. In this work, we introduce the prediction alignment problem, aiming to match the predictor obtained via PEFT to that of full...
22
Loss Gap Parity for Fairness in Heterogeneous Federated Learning
Brahim Erraji, Michaël Perrot, Aurélien Bellet
📅 2026-03-31
While clients may join federated learning to improve performance on data they rarely observe locally, they often remain self-interested, expecting the global model to perform well on their own data. This motivates an objective that ensures all clients achieve a similar loss gap - the difference in performance between the global model and the best model they could train using only their local...
23
AMShortcut: An Inference- and Training-Efficient Inverse Design Model for Amorphous Materials
Yan Lin, Jonas A. Finkler, Tao Du et al. (5 authors)
📅 2026-03-31
Amorphous materials are solids that lack long-range atomic order but possess complex short- and medium-range order. Unlike crystalline materials that can be described by unit cells containing a few to hundreds of atoms, amorphous materials require larger simulation cells with at least hundreds or often thousands of atoms. Inverse design of amorphous materials with probabilistic generative models...
24
From Density Matrices to Phase Transitions in Deep Learning: Spectral Early Warnings and Interpretability
Max Hennick, Guillaume Corlouer
📅 2026-03-31
A key problem in the modern study of AI is predicting and understanding emergent capabilities in models during training. Inspired by methods for studying reactions in quantum chemistry, we present the "2-datapoint reduced density matrix". We show that this object provides a computationally efficient, unified observable of phase transitions during training. By tracking the eigenvalue...
25
Multimodal Machine Learning for Early Prediction of Metastasis in a Swedish Multi-Cancer Cohort
Franco Rugolon, Korbinian Randl, Braslav Jovanovic et al. (5 authors)
📅 2026-03-31
Multimodal Machine Learning offers a holistic view of a patient's status, integrating structured and unstructured data from electronic health records (EHR). We propose a framework to predict metastasis risk one month prior to diagnosis, using six months of clinical history from EHR data. Data from four cancer cohorts collected at Karolinska University Hospital (Stockholm, Sweden) were...
26
Reasoning-Driven Synthetic Data Generation and Evaluation
Tim R. Davidson, Benoit Seguin, Enrico Bacis et al. (5 authors)
📅 2026-03-31
Although many AI applications of interest require specialized multi-modal models, relevant data to train such models is inherently scarce or inaccessible. Filling these gaps with human annotators is prohibitively expensive, error-prone, and time-consuming, leading model builders to increasingly consider synthetic data as a scalable alternative. However, existing synthetic data generation methods...
27
Big2Small: A Unifying Neural Network Framework for Model Compression
Jing-Xiao Liao, Haoran Wang, Tao Li et al. (7 authors)
📅 2026-03-31
With the development of foundational models, model compression has become a critical requirement. Various model compression approaches have been proposed such as low-rank decomposition, pruning, quantization, ergodic dynamic systems, and knowledge distillation, which are based on different heuristics. To elevate the field from fragmentation to a principled discipline, we construct a unifying...
28
Training-Free Dynamic Upcycling of Expert Language Models
Eros Fanì, Oğuzhan Ersoy
📅 2026-03-31
Large Language Models (LLMs) have achieved remarkable performance on a wide range of specialized tasks, exhibiting strong problem-solving capabilities. However, training these models is prohibitively expensive, and they often lack domain-specific expertise because they rely on general knowledge datasets. Expertise finetuning can address this issue; however, it often leads to overspecialization,...
29
One-for-All: A Lightweight Stabilized and Parameter-Efficient Pre-trained LLM for Time Series Forecasting
Prasanjit Dey, Soumyabrata Dev, Bianca Schoen-Phelan
📅 2026-03-31
We address the challenge of adapting pre-trained Large Language Models (LLMs) for multivariate time-series analysis, where their deployment is often hindered by prohibitive computational and memory demands. Our solution, One-for-All, introduces Gaussian Rank-Stabilized Low-Rank Adapters (rsLoRA) to enable parameter-efficient fine-tuning of frozen LLMs. While inspired by LoRA, rsLoRA introduces a...
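The rank-stabilized idea behind rsLoRA is a change of scaling: standard LoRA multiplies the low-rank update by alpha / r, while rsLoRA uses alpha / sqrt(r) so updates do not shrink away at higher ranks. A minimal sketch of that scaling (generic rsLoRA; the paper's Gaussian variant and its time-series adapter are not reproduced):

```python
import numpy as np

def lora_forward(x, W, A, B, alpha, rank_stabilized=True):
    """Frozen weight W plus a low-rank update B @ A (LoRA).

    rank_stabilized=True applies the rsLoRA scaling alpha / sqrt(r)
    instead of the standard LoRA scaling alpha / r.
    """
    r = A.shape[0]
    scale = alpha / np.sqrt(r) if rank_stabilized else alpha / r
    return x @ (W + scale * (B @ A)).T

rng = np.random.default_rng(0)
d_in, d_out, alpha = 16, 8, 8.0
x = rng.normal(size=(1, d_in))
W = rng.normal(size=(d_out, d_in))  # frozen pretrained weight
for r in (4, 64):
    A = rng.normal(size=(r, d_in)) / np.sqrt(d_in)
    B = rng.normal(size=(d_out, r)) / np.sqrt(r)
    delta = np.linalg.norm(lora_forward(x, W, A, B, alpha) - x @ W.T)
    print(f"r={r}: rsLoRA update norm {delta:.2f}")
```

With alpha / r the update norm collapses as r grows; alpha / sqrt(r) keeps it on a comparable scale across ranks, which is the stabilization the name refers to.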
30
HyperKKL: Learning KKL Observers for Non-Autonomous Nonlinear Systems via Hypernetwork-Based Input Conditioning
Yahia Salaheldin Shaaban, Abdelrahman Sayed Sayed, M. Umar B. Niazi et al. (4 authors)
📅 2026-03-31
Kazantzis-Kravaris/Luenberger (KKL) observers are a class of state observers for nonlinear systems that rely on an injective map to transform the nonlinear dynamics into a stable quasi-linear latent space, from which the state estimate is obtained in the original coordinates via a left inverse of the transformation map. Current learning-based methods for these maps are designed exclusively for...
31
mlr3mbo: Bayesian Optimization in R
Marc Becker, Lennart Schneider, Martin Binder et al. (5 authors)
📅 2026-03-31
We present mlr3mbo, a comprehensive and modular toolbox for Bayesian optimization in R. mlr3mbo supports single- and multi-objective optimization, multi-point proposals, batch and asynchronous parallelization, input and output transformations, and robust error handling. While it can be used for many standard Bayesian optimization variants in applied settings, researchers can also construct custom...
32
Unbounded Density Ratio Estimation and Its Application to Covariate Shift Adaptation
Ren-Rui Liu, Jun Fan, Lei Shi et al. (4 authors)
📅 2026-03-31
This paper focuses on the problem of unbounded density ratio estimation -- an understudied yet critical challenge in statistical learning -- and its application to covariate shift adaptation. Much of the existing literature assumes that the density ratio is either uniformly bounded or unbounded but known exactly. These conditions are often violated in practice, creating a gap between theoretical...
33
Nonnegative Matrix Factorization in the Component-Wise L1 Norm for Sparse Data
Giovanni Seraghiti, Kévin Dubrulle, Arnaud Vandaele et al. (4 authors)
📅 2026-03-31
Nonnegative matrix factorization (NMF) approximates a nonnegative matrix, $X$, by the product of two nonnegative factors, $WH$, where $W$ has $r$ columns and $H$ has $r$ rows. In this paper, we consider NMF using the component-wise L1 norm as the error measure (L1-NMF), which is suited for data corrupted by heavy-tailed noise, such as Laplace noise or salt and pepper noise, or in the presence of...
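The factorization being generalized here can be sketched with the classic Frobenius-norm (L2) multiplicative updates, shown purely as a baseline; the paper's component-wise L1 objective requires different updates and is what provides robustness to heavy-tailed noise. Matrix sizes below are invented:

```python
import numpy as np

def nmf_multiplicative(X, r, iters=500, seed=0):
    """Lee-Seung multiplicative updates for Frobenius-norm NMF.

    Baseline L2 sketch only; L1-NMF replaces this error measure and
    needs its own update rules.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r)) + 1e-3
    H = rng.random((r, n)) + 1e-3
    for _ in range(iters):
        # Multiplicative updates preserve nonnegativity of W and H.
        H *= (W.T @ X) / (W.T @ W @ H + 1e-12)
        W *= (X @ H.T) / (W @ H @ H.T + 1e-12)
    return W, H

# Recover an exact rank-2 nonnegative factorization of a toy matrix.
rng = np.random.default_rng(1)
X = rng.random((6, 2)) @ rng.random((2, 5))
W, H = nmf_multiplicative(X, 2)
print(np.linalg.norm(X - W @ H) / np.linalg.norm(X))  # near 0
```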
34
Symphony for Medical Coding: A Next-Generation Agentic System for Scalable and Explainable Medical Coding
Joakim Edin, Andreas Motzfeldt, Simon Flachs et al. (4 authors)
📅 2026-03-31
Medical coding translates free-text clinical documentation into standardized codes drawn from classification systems that contain tens of thousands of entries and are updated annually. It is central to billing, clinical research, and quality reporting, yet remains largely manual, slow, and error-prone. Existing automated approaches learn to predict a fixed set of codes from labeled data, thereby...
35
Mind the Gap: A Framework for Assessing Pitfalls in Multimodal Active Learning
Dustin Eisenhardt, Yunhee Jeong, Florian Buettner
📅 2026-03-31
Multimodal learning enables neural networks to integrate information from heterogeneous sources, but active learning in this setting faces distinct challenges: missing modalities, differences in modality difficulty, and varying interaction structures, all issues absent in the unimodal case. While the behavior of active learning strategies in unimodal settings is well...
36
A Comprehensive Information-Decomposition Analysis of Large Vision-Language Models
Lixin Xiu, Xufang Luo, Hideki Nakayama
📅 2026-03-31
Large vision-language models (LVLMs) achieve impressive performance, yet their internal decision-making processes remain opaque, making it difficult to determine if the success stems from true multimodal fusion or from reliance on unimodal priors. To address this attribution gap, we introduce a novel framework using partial information decomposition (PID) to quantitatively measure the...
37
Concept frustration: Aligning human concepts and machine representations
Enrico Parisini, Christopher J. Soelistyo, Ahab Isaac et al. (5 authors)
📅 2026-03-31
Aligning human-interpretable concepts with the internal representations learned by modern machine learning systems remains a central challenge for interpretable AI. We introduce a geometric framework for comparing supervised human concepts with unsupervised intermediate representations extracted from foundation model embeddings. Motivated by the role of conceptual leaps in scientific discovery,...
38
Disentangled Graph Prompting for Out-Of-Distribution Detection
Cheng Yang, Yu Hao, Qi Zhang et al. (4 authors)
📅 2026-03-31
When testing data and training data come from different distributions, deep neural networks (DNNs) will face significant safety risks in practical applications. Therefore, out-of-distribution (OOD) detection techniques, which can identify OOD samples at test time and alert the system, are urgently needed. Existing graph OOD detection methods usually characterize fine-grained in-distribution (ID)...
39
Central limit theorems for the outputs of fully convolutional neural networks with time series input
Annika Betken, Giorgio Micali, Johannes Schmidt-Hieber
📅 2026-03-31
Deep learning is widely deployed for time series learning tasks such as classification and forecasting. Despite the empirical successes, little theory has been developed so far in the time series context. In this work, we prove that if the network inputs are generated from short-range dependent linear processes, the outputs of fully convolutional neural networks (FCNs) with global average...
40
The Geometry of Polynomial Group Convolutional Neural Networks
Yacoub Hendi, Daniel Persson, Magdalena Larfors
📅 2026-03-31
We study polynomial group convolutional neural networks (PGCNNs) for an arbitrary finite group $G$. In particular, we introduce a new mathematical framework for PGCNNs using the language of graded group algebras. This framework yields two natural parametrizations of the architecture, based on Hadamard and Kronecker products, related by a linear map. We compute the dimension of the associated...
41
Total Variation Guarantees for Sampling with Stochastic Localization
Jakob Kellermann
📅 2026-03-31
Motivated by the success of score-based generative models, a number of diffusion-based algorithms have recently been proposed for the problem of sampling from a probability measure whose unnormalized density can be accessed. Among them, Grenioux et al. introduced SLIPS, a sampling algorithm based on Stochastic Localization. While SLIPS exhibits strong empirical performance, no rigorous...
42
Capturing Multivariate Dependencies of EV Charging Events: From Parametric Copulas to Neural Density Estimation
Martin Výboh, Gabriela Grmanová
📅 2026-03-31
Accurate event-based modeling of electric vehicle (EV) charging is essential for grid reliability and smart-charging design. While traditional statistical methods capture marginal distributions, they often fail to model the complex, non-linear dependencies between charging variables, specifically arrival times, durations, and energy demand. This paper addresses this gap by introducing the first...
43
Bringing Up a Bilingual BabyLM: Investigating Multilingual Language Acquisition Using Small-Scale Models
Linda Zeng, Steven Y. Feng, Michael C. Frank
📅 2026-03-31
Multilingualism is incredibly common around the world, leading to many important theoretical and practical questions about how children learn multiple languages at once. For example, does multilingual acquisition lead to delays in learning? Are there better and worse ways to structure multilingual input? Many correlational studies address these questions, but it is surprisingly difficult to get...
44
Learning Surrogate LPV State-Space Models with Uncertainty Quantification
E. Javier Olucha, Valentin Preda, Amritam Das et al. (4 authors)
📅 2026-03-31
The Linear Parameter-Varying (LPV) framework enables the construction of surrogate models of complex nonlinear and high-dimensional systems, facilitating efficient stability and performance analysis together with controller design. Despite significant advances in data-driven LPV modelling, existing approaches do not quantify the uncertainty of the obtained LPV models. Consequently, assessing...
45
Sampling at intermediate temperatures is optimal for training large language models in protein structure prediction
L. Ghiringhelli, A. Zambon, G. Tiana
📅 2026-03-31
We investigate the parameter space of transformer models trained on protein sequence data using a statistical mechanics framework, sampling the loss landscape at varying temperatures by Langevin dynamics to characterize the low-loss manifold and understand the mechanisms underlying the superior performance of transformers in protein structure prediction. We find that, at variance with feedforward...
46
Baby Scale: Investigating Models Trained on Individual Children's Language Input
Steven Y. Feng, Alvin W. M. Tan, Michael C. Frank
📅 2026-03-31
Modern language models (LMs) must be trained on many orders of magnitude more words of training data than human children receive before they begin to produce useful behavior. Assessing the nature and origins of this "data gap" requires benchmarking LMs on human-scale datasets to understand how linguistic knowledge emerges from children's natural training data. Using transcripts...
47
Variational Graph Neural Networks for Uncertainty Quantification in Inverse Problems
David Gonzalez, Alba Muixi, Beatriz Moya et al. (4 authors)
📅 2026-03-31
The increasingly wide use of deep machine learning techniques in computational mechanics has significantly accelerated simulations of problems that were considered unapproachable just a few years ago. However, in critical applications such as Digital Twins for engineering or medicine, fast responses are not enough; reliable results must also be provided. In certain cases, traditional...
48
Target-Aligned Reinforcement Learning
Leonard S. Pleiss, James Harrison, Maximilian Schiffer
📅 2026-03-31
Many reinforcement learning algorithms rely on target networks - lagged copies of the online network - to stabilize training. While effective, this mechanism introduces a fundamental stability-recency tradeoff: slower target updates improve stability but reduce the recency of learning signals, hindering convergence speed. We propose Target-Aligned Reinforcement Learning (TARL), a framework that...
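The target-network mechanism and its stability-recency tradeoff can be illustrated with scalar "parameters" (a toy sketch of lagged copies; TARL's actual alignment scheme is not reproduced):

```python
import copy

# A target network is a lagged copy of the online parameters: either a
# periodic hard copy every K steps, or a Polyak (soft) moving average.
# Slower updates (large K, small tau) mean more stability but staler
# learning signals - the tradeoff the abstract describes.
online = [0.0]
hard_target, soft_target = [0.0], [0.0]
K, tau = 50, 0.01
for step in range(1, 1001):
    online[0] += 0.01                                     # stand-in for a gradient step
    if step % K == 0:
        hard_target = copy.deepcopy(online)               # periodic hard copy
    soft_target[0] += tau * (online[0] - soft_target[0])  # Polyak averaging
print(f"online={online[0]:.2f} "
      f"hard={hard_target[0]:.2f} soft={soft_target[0]:.2f}")
```

The soft target permanently lags the moving online parameters by roughly (step size) / tau, making concrete why recency suffers as updates are slowed for stability.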
49
Learning to Generate Formally Verifiable Step-by-Step Logic Reasoning via Structured Formal Intermediaries
Luoxin Chen, Yichi Zhou, Huishuai Zhang
📅 2026-03-31
Large language models (LLMs) have recently demonstrated impressive performance on complex, multi-step reasoning tasks, especially when post-trained with outcome-rewarded reinforcement learning (Guo et al., 2025). However, it has been observed that outcome rewards often overlook flawed intermediate steps, leading to unreliable reasoning steps even when final answers are correct. To address this...
50
Model Predictive Path Integral PID Control for Learning-Based Path Following
Teruki Kato, Koshi Oishi, Seigo Ito
📅 2026-03-31
Classical proportional-integral-derivative (PID) control is widely employed in industrial applications; however, achieving higher performance often motivates the adoption of model predictive control (MPC). Although gradient-based methods are the standard for real-time optimization, sampling-based approaches have recently gained attention. In particular, model predictive path integral (MPPI)...