AIApril 14, 2025Reasoning Models Know When They’re Right: NYU Researchers Introduce a Hidden-State Probe That Enables Efficient Self-Verification and Reduces Token Usage by 24%
AIApril 11, 2025Balancing Accuracy and Efficiency in Language Models: A Two-Phase RL Post-Training Approach for Concise Reasoning
AIApril 3, 2025Snowflake Proposes ExCoT: A Novel AI Framework that Iteratively Optimizes Open-Source LLMs by Combining CoT Reasoning with off-Policy and on-Policy DPO, Relying Solely on Execution Accuracy as Feedback
AIFebruary 10, 2025Tutorial to Fine-Tuning Mistral 7B with QLoRA Using Axolotl for Efficient LLM Training
AIJanuary 20, 2025Google AI Proposes a Fundamental Framework for Inference-Time Scaling in Diffusion Models