OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 7 days ago • 276
Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models Paper • 2601.14004 • Published 22 days ago • 46
ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents Paper • 2601.12294 • Published 25 days ago • 17
ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents Paper • 2601.12294 • Published 25 days ago • 17
ToolPRMBench: Evaluating and Advancing Process Reward Models for Tool-using Agents Paper • 2601.12294 • Published 25 days ago • 17
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation Paper • 2601.08430 • Published 29 days ago • 59
EnvScaler: Scaling Tool-Interactive Environments for LLM Agent via Programmatic Synthesis Paper • 2601.05808 • Published Jan 9 • 36