
Claude Opus 4.7 is here, and the long-context benchmarks got worse
Anthropic's Opus 4.7 is state-of-the-art on SWE-bench and CursorBench, but independent tests show regressions on long-context retrieval and thematic reasoning.
Artificial intelligence news, model releases, and lab announcements.

Google shipped a native Swift Gemini app for macOS with screen sharing, voice, and Deep Research. Here's what it does, what it doesn't, and how it stacks up.

OpenAI's new cybersecurity-tuned model can reverse-engineer binaries and analyze malware. It's restricted to verified defenders through the Trusted Access program.

Adobe renamed Project Moonlight to Firefly AI Assistant and opened a public beta. It runs multi-step workflows across Photoshop, Premiere, Lightroom, and more.

Anthropic just shipped Routines: Claude Code sessions as cron jobs, webhooks, and GitHub-event reactors. Here's what they replace, what they don't, and one rule to follow.