AI | May 1, 2025 | Multimodal AI on Developer GPUs: Alibaba Releases Qwen2.5-Omni-3B with 50% Lower VRAM Usage and Nearly-7B Model Performance
AI | April 23, 2025 | Muon Optimizer Significantly Accelerates Grokking in Transformers: Microsoft Researchers Explore Optimizer Influence on Delayed Generalization
AI | April 22, 2025 | LLMs Can Now Retain High Accuracy at 2-Bit Precision: Researchers from UNC Chapel Hill Introduce TACQ, a Task-Aware Quantization Approach that Preserves Critical Weight Circuits for Compression Without Performance Loss
AI | April 14, 2025 | Reasoning Models Know When They’re Right: NYU Researchers Introduce a Hidden-State Probe That Enables Efficient Self-Verification and Reduces Token Usage by 24%
AI | March 24, 2025 | Lyra: A Computationally Efficient Subquadratic Architecture for Biological Sequence Modeling