AMD Zen 7 'Florence' leak: 288 cores, a separate cache die, and a 25% IPC swing
Leaked slides describe Zen 7 EPYC scaling to 288 cores per socket, with all L3 moved off the CCD onto a stacked cache die. Tape-out is October 2026, launch late 2028.
AMD’s next-next CPU generation already has a roadmap leaking. New slides traced back to the Moore’s Law Is Dead YouTube channel describe Zen 7 EPYC, codenamed Florence, scaling to 288 cores per socket with a complete redesign of how cache lives on the chip. Two questions matter: what’s actually new, and how seriously to take a 2028 leak in April 2026.
What is Zen 7 actually?
Zen 7 is AMD’s planned successor to Zen 6, which itself isn’t out yet. Server parts are codenamed Florence, the desktop platform is Grimlock Ridge, and the laptop variant is a Halo-style design called Sound Wave in the leaks. AMD typically runs a two-year cadence between architectures, and Zen 7’s leaked timeline matches: A0 tape-out in October 2026, mass production mid-2028, product launch around the end of 2028, per the slides reported by Notebookcheck.
The headline number is 288 cores per socket. That’s roughly 1.5× what Zen 6 EPYC “Venice” reportedly tops out at, and almost double what Zen 5 EPYC “Turin” ships today. But raw core count is the boring part. The interesting part is how AMD got there.
How the cache lives now
In every Zen design from 2017 to today, the L3 cache and the cores have shared a die. The CCD (Core Compute Die) has been a self-contained cluster of cores plus its own L3, and AMD’s 3D V-Cache trick stacked extra L3 on top of it.
Zen 7’s Steamboat CCD throws that out. As HotHardware describes the leak, Steamboat has no L3 on the CCD at all. Every primary die is dedicated to CPU cores. The L3 goes on a separate 3D-stacked die underneath, called L3D, which carries up to 7 MB per core.
Multiply that out and a top-tier 288-core Florence has roughly 2 GB of L3 cache at the socket level. For context, that’s larger than the entire main memory of most desktops in the early 2000s, sitting on a cache die.
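The arithmetic behind that figure is easy to sanity-check. A minimal sketch, assuming the leaked numbers (7 MB of stacked L3 per core, 288 cores per top-tier socket) hold:

```python
# Back-of-the-envelope check of the leaked Florence cache figures.
# Both inputs come from the leak, not from anything AMD has confirmed.
L3_PER_CORE_MB = 7        # leaked L3D capacity per core
CORES_PER_SOCKET = 288    # leaked top-tier core count

total_l3_mb = L3_PER_CORE_MB * CORES_PER_SOCKET
print(f"L3 per socket: {total_l3_mb} MB (~{total_l3_mb / 1024:.2f} GB)")
# → L3 per socket: 2016 MB (~1.97 GB)
```

So "roughly 2 GB" is the 2016 MB product rounded up; a dual-socket node under the same assumptions would carry just under 4 GB of L3.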
The architectural argument is straightforward. Cache and cores have very different ideal manufacturing recipes. Cores want the densest, fastest logic process. Cache wants something cheaper and more thermally tolerant. Splitting them into separate dies lets AMD print each on whatever node makes sense, and stack them with TSMC’s hybrid bonding. The trade is more packaging complexity in exchange for both better performance and better economics per core.
Why Florence reuses Zen 6 IO
The other leaked detail is that Florence remains compatible with the Dwarka IO die from Zen 6 and the Mathura memory die. That part is significant for AMD’s enterprise customers. SP7/SP8 socket compatibility means hyperscalers who just bought Zen 6 platforms in 2027 can drop Zen 7 chips into the same boards in 2028 without a full re-qualification cycle.
That’s the same play AMD ran with AM5 on the desktop. Long-lived sockets are now a serious feature, not just a marketing line, because hyperscale buyers price re-qualification at six-figure engineering hours per platform. We saw this dynamic in Meta’s 1-gigawatt MTIA commitment with Broadcom: when you’re buying chips at gigawatt scale, the cost of changing the rest of the rack matters more than the cost of the chip itself.
The IPC and accelerator story
The performance claim is 15-25% IPC gain over Zen 6, per TweakTown’s writeup. To put that in context, Zen 4 to Zen 5 was about 16% IPC. So 25% would be a bigger generational jump than AMD has historically shipped.
The other accelerator number: 4× FP8 throughput per cycle, and 2× INT8. Those are the data formats that matter for AI inference on CPUs. Right now most AI inference runs on GPUs or dedicated accelerators, and CPUs handle preprocessing, control flow, and the small models nobody wants to allocate a GPU for. A 4× FP8 jump moves the line on what’s “small enough to do on the CPU.” Combine that with 288 cores per socket and you have a server part that can absorb a meaningful chunk of inference work without a GPU touching it.
That’s relevant to Anker’s Thus chip story we covered yesterday, and to the broader compute-in-memory direction. The industry’s bet is that AI work fragments across many specialized substrates rather than living entirely on Nvidia GPUs. Zen 7’s accelerator pitch is AMD’s argument for keeping a piece of that work on the x86 server CPU.
How seriously to take a 2028 leak
The honest answer is: it’s trustworthy on direction, tentative on the specifics.
Moore’s Law Is Dead has a real track record on AMD roadmap leaks. Zen 4 and Zen 5 details lined up well in retrospect. But chip design slides this far out describe what an architecture team is trying to ship, not what it will actually ship. Targets slip. CCDs get cut. IPC numbers come down once silicon hits real workloads.
A0 tape-out in October 2026 is the only piece that’s hard to reschedule. If Florence tapes out on time, the rest of the timeline gets the usual 12 to 24 months of real-world scrambling. So 2028 is the realistic window, and the 288-core, 2 GB-L3 figure is the ceiling, not necessarily the configuration AMD ships at retail.
The pattern to watch is whether Intel keeps Florence honest. Intel’s foundry is finally bringing in real customers, with Tesla’s Terafab signing on for the 14A process earlier this month. If Intel’s own server roadmap, Diamond Rapids and beyond, lands on schedule on a competitive node, AMD’s leaked 25% IPC figure has to be evaluated against an Intel that’s no longer two nodes behind. That’s a different competitive frame than the one AMD has enjoyed since 2020.
Hyperscalers will get the answer before retail does. Florence sample silicon will reach Microsoft, Meta, AWS, and Google sometime in late 2027, and their internal procurement teams will benchmark it against whatever Intel and Nvidia are pitching for the same fiscal year. Whether the leaked “284 to 288 cores” figure lands exactly matters less than whether Florence wins a meaningful chunk of that procurement on perf-per-watt, perf-per-rack, and total cost of running the rack for five years. AMD has won that bake-off three generations in a row. Zen 7 is the first one where the playing field looks different.
What this means for you
If you’re buying CPUs this year, none of this changes anything. Florence is two-and-a-half years out and SP7 boards qualified for Zen 6 will still be useful for at least one more cycle.
If you’re planning datacenter capacity for late 2028 and beyond, the planning input here is socket compatibility plus cache-density-per-watt. Zen 7 EPYC fits in Zen 6 boards. That’s an upgrade path, not a forklift, which changes the math on whether to buy 2027 capacity now or wait. The 2 GB of L3 per socket also reframes what cache-bound workloads (databases, in-memory analytics, some inference) cost: if you’re paying for memory bandwidth today, you may pay for less of it in 2028.
If you build software, the 4× FP8 number is the line worth tracking. CPU-side AI inference has been a niche use case for years. A real generational uplift is the thing that decides whether your team adds a CPU fallback path to its inference stack or keeps treating it as GPU-only. Bookmark October 2026’s tape-out as the first checkpoint.
Sources
- AMD Zen 7 'Florence' leak teases 288-core Epyc chips and major laptop efficiency gains — Notebookcheck
- AMD Zen 7 Leak Reveals Major IPC Gains And Huge Cache Upgrades — HotHardware
- New AMD Zen 7 leak surfaces, up to 25% IPC upgrade over the unreleased Zen 6 — TweakTown
- AMD Zen 7 CPU Specifications Leak — OC3D
Frequently Asked Questions
- When does Zen 7 actually ship?
- The leaked timeline puts A0 tape-out in October 2026, mass production in mid-2028, and product launch around the end of 2028. That's still a paper roadmap; no AMD official has confirmed it.
- Will Zen 7 EPYC drop into existing Zen 6 server boards?
- The leaks say yes. Florence reportedly reuses the Zen 6 IO die and pairs it with new Steamboat CCDs and stacked L3 cache dies, which means the platform stays SP7/SP8 compatible.
- Where does the 288-core figure come from?
- Each Steamboat CCD packs 36 cores, so eight CCDs in a top-tier EPYC SKU yield 288 cores per socket. Dual-socket servers would land near 576 cores per node if AMD ships those configs.
- How does this compare to Nvidia's roadmap?
- Nvidia's Vera Rubin is on a different axis: an ARM-based GPU-tightly-coupled CPU. Florence is a classic x86 server CPU with massive cache. Both target AI workloads, but Zen 7 is the upgrade for the customer who's still buying CPU sockets.