AI | June 19, 2025 | HtFLlib: A Unified Benchmarking Library for Evaluating Heterogeneous Federated Learning Methods Across Modalities
AI | May 31, 2025 | Multimodal Foundation Models Fall Short on Physical Reasoning: PHYX Benchmark Highlights Key Limitations in Visual and Symbolic Integration
AI | April 24, 2025 | AWS Introduces SWE-PolyBench: A New Open-Source Multilingual Benchmark for Evaluating AI Coding Agents
AI | April 17, 2025 | Model Performance Begins with Data: Researchers from Ai2 Release DataDecide—A Benchmark Suite to Understand Pretraining Data Impact Across 30K LLM Checkpoints
AI | March 5, 2025 | Researchers from FutureHouse and ScienceMachine Introduce BixBench: A Benchmark Designed to Evaluate AI Agents on Real-World Bioinformatics Tasks