#AI Benchmarks
5 articles with this tag

Artificial Intelligence
Exa Unveils New Code Search Benchmarks
Exa.ai releases 'WebCode', a new benchmark suite for evaluating search performance in coding agents, addressing limitations in existing tools.
8 days ago

Technology
Poetiq's Small Team Sets New Together AI Benchmark Records
Poetiq's small team is setting new Together AI benchmark records by leveraging recursively self-improving meta-systems to optimize existing LLMs.
about 1 month ago

Artificial Intelligence
Claude Sonnet 4.6 Ups the AI Ante
Anthropic's Claude Sonnet 4.6 launches with major upgrades in coding, reasoning, and computer use, plus a 1M token context window.
about 1 month ago

AI Research
Step 3.5 Flash: AI's New Efficiency Standard
Step 3.5 Flash AI model revolutionizes AI efficiency with a 196B parameter foundation and 11B active parameters, offering competitive performance with lower latency.
about 2 months ago

Artificial Intelligence
Claude Opus 4.6: Smarter, Faster, and Longer Context
Anthropic's Claude Opus 4.6 launches with a 1M token context window, enhanced coding, and state-of-the-art benchmark performance.
about 2 months ago