Is One Layer Enough? Training A Single Transformer Layer Can Match Full-Parameter RL Training
Focuses on Is One Layer Enough? Training A Single Transformer Layer Can Match Full-Parameter RL Training.
Source archive
Fresh research papers across agents, reasoning, multimodal systems, and AI tooling. This source page gives readers and crawlers a stable route for the latest arXiv coverage.
Indexed briefings: 1597
Latest source-linked updates, ordered newest first.
Latest
Focuses on Is One Layer Enough? Training A Single Transformer Layer Can Match Full-Parameter RL Training.
Focuses on The State-Prediction Separation Hypothesis.
Focuses on TiRex-2: Generalizing TiRex to Multivariate Data and Streaming.
Focuses on A Lightweight Self-Supervised Learning Framework for Multivariate Time Series using Hierarchical-JEPA on ECG Data.
Focuses on GAIA: Geometry-Adaptive Operator Learning for Forward and Inverse Problems.
Focuses on SynLaD: Latent Diffusion for Generating Synthesizable Molecules Conditioned on 3D Pharmacophore Profiles.
Focuses on Group-invariant Coresets for Data-efficient Active Learning.
Focuses on Agentic generation of verifiable rules for deterministic, self-expanding reaction classification.
Focuses on GeoSearcher: Anchor-Guided Progressive Reasoning for Remote Sensing Visual Grounding with Process Supervision.
Focuses on Understanding Large Language Models.
Focuses on PointSplat: Compact Gaussian Splatting via Human-Centric Prediction.
Focuses on KnowledgeDebugger -- an Exploration Tool for Knowledge Localization and Editing in Transformers.
Focuses on SpheRoPE: Zero-Shot Optimization-Free 360 Panorama Generation with Spherical RoPE.
Focuses on Deep Multitask Learning for Mixed-Type Outcomes with Shared Sparsity.
Focuses on Radial Suppression Accelerates Algorithmic Generalization: A Geometric Analysis of Delayed Generalization.
Focuses on Slope-Guided Mamba and Angular-Refined Transformer for Light Field Super-Resolution.
Focuses on LUNA: Learning Universal 3D Human Animation Beyond Skinning.
Focuses on Post-Training Pruning for Diffusion Transformers.
Focuses on Signed-Permutation Coordinate Transport for RMSNorm Transformers.
Focuses on Condensing Large-Scale Datasets Directly with Minimal Information Loss.
Focuses on FlexViT: A Flexible FPGA-based Accelerator for Edge Vision Transformers.
Focuses on MG-RWKV: Multi-Grained Context-Aware RWKV for Temporal Forgery Localization.
Focuses on Attend, Transform, or Silence: Operator-Level Visual Skipping for Efficient Multimodal LLM Inference.
Focuses on Geometry-Aware Cross-Height Channel Knowledge Map Prediction for UAV-Assisted Communications With Uncertainty-Guided 3D Sensing.
Focuses on Belief Contraction in Dynamic Epistemic Logic.
Focuses on Mirror-Fusion Attention for Reflection-Aware Self-Supervised Representation Learning.
Focuses on Review Residuals: Update-Conditioned Residual Gating for Transformers.
Focuses on Rethinking Multi-Label Image Classification With Deep Learning: Taxonomy, Challenge, and Outlook.
Focuses on Low-dimensional topology of deep neural networks.
Focuses on Towards High-Resolution Visual Perception via Hierarchical Entity Exploration.
Focuses on Explicit Fuzzy Logic in the Feed-Forward Layer: Self-Forgetting Quantifiers Discover Legible Grammatical-Licensing Detectors.
Focuses on SpiralFovea: Input-Adaptive Foveated Tokenization as a Third Lever of Resource-Adaptive Inference.
Focuses on An Agentic AI Framework to Accelerate Scientific Discovery in Plant Phenotyping.
Focuses on Soft Mixture-of-Recursions: Going Deeper with Recursive Vision Transformers.
Focuses on Generative Lane Topology Reasoning via Autoregressive Model with Geometry Prior.
Focuses on Decoupled Guidance: Disentangling Subject and Context Pathways in Text-to-Image Personalization.