AIJune 17, 2025OpenBMB Releases MiniCPM4: Ultra-Efficient Language Models for Edge Devices with Sparse Attention and Fast Inference
AIMay 20, 2025Meta Introduces KernelLLM: An 8B LLM that Translates PyTorch Modules into Efficient Triton GPU Kernels