By Jianping Jiang, Head of Product Marketing, CXL Switch, Marvell
The AI memory wall—the widening gap between the memory capacity and bandwidth AI infrastructure wants and the amount that conventional memory architectures can deliver—is accelerating at an alarming pace.
And the consequences are getting increasingly ominous for data center operators and their customers: idle XPUs, underutilized equipment, longer processing times, higher costs, and ultimately a lower return on investment. Meanwhile, memory—already second only to GPUs in datacenter semiconductor spend1—continues to soar in price.
The Marvell® StructeraTM S family of Compute Express Link (CXL) switches scale the memory wall by providing a pathway for adding terabytes of shareable memory to infrastructure and dynamically allocating bandwidth and capacity to boost utilization and application performance. CXL switches don’t just boost memory and memory capacity; they enable data center operators to use it more wisely too.
Structera S is the successor to the groundbreaking Apollo line of CXL switches developed by XConn Technologies, now part of Marvell. Structera S 20256 for PCIe Gen 5.0/CXL 2.0 (previously the XConn Apollo I) became the first commercially available CXL switch upon its release last year.
Marvell is expanding the family with Structera S 30260 for PCIe 6.0/CXL 3.x. Structera S 30260 features support for 16 or 32 CPUs or GPUs over 260 lanes with up to 48TB of shared memory and 4TB/second cumulative bandwidth. Marvell is showcasing Structera S 30260 in a live demonstration this week at OFC 2026 and plans on sampling to customers in 3Q 2026.
By Krishna Mallampati, Senior Director of Product Marketing, Data Center Switching, Marvell
Since its introduction in 2004, PCIe® has become the most popular interconnect for low-latency chip-to-chip connections. From its humble beginnings for fan-out interconnects, PCIe has been integrated into AI and cloud servers, JBOF storage systems, ADAS systems in automotive, industrial automation, PCs, and other platforms.
Scale-up AI servers—which can contain hundreds of processors spread over multiple racks—represent the next logical step for PCIe. Although far larger than today’s single chassis AI servers, scale-up servers demand the same thing from interconnect fabrics: coherent, low-latency links that enable fast, secure communication between components. PCIe’s status as a widely-used standard that evolves to meet customer demands further puts it in the forefront for scale-up.
Let’s explore the PCIe scale-up usage model and how these architectures will evolve.
PCIe Scale-up Usage Model

By Xi Wang, Senior Vice President and General Manager of the Connectivity Business Unit, Marvell
Marvell has become a founding member of the eXtra dense Pluggable Optics (XPO) Multi-Source Agreement (MSA), an industry initiative organized by Arista Networks to define a new optical transceiver form factor purpose-built for AI-scale infrastructure.
The XPO concept is designed to dramatically increase bandwidth density by enabling liquid cooling at the module level. XPO modules are substantially larger in size than octal small form factor pluggable (OFSP) modules commonly deployed in today’s data centers, but they deliver a step-function increase in performance. Each XPO module integrates 64 lanes operating at 200 Gbps, eight times more than current pluggable modules for a total of 12.8 Tbps of bandwidth per module.1
This leap in bandwidth is enabled in part by an integrated cold plate that can deliver up to 400W of cooling per module. The combination of larger modules, significantly higher lane counts, and liquid cooling delivers a four-fold increase in bandwidth density for switches across scale-up, scale-out or scale-across network architecture.
By Khurram Malik, Senior Director of Marketing, Custom Cloud Solutions, Marvell
Can AI beat a human at the game of twenty questions? Yes.
And can a server enhanced by CXL beat an AI server without it? Yes, and by a wide margin.
While CXL technology was originally developed for general-purpose cloud servers, the technology is now finding a home in AI as a vehicle for economically and efficiently boosting the performance of AI infrastructure. To this end, Marvell has been conducting benchmark tests on different AI use cases.
In December, Marvell, Samsung and Liqid showed how Marvell® StructeraTM A CXL compute accelerators can reduce the time required for conducting vector searches (for analyzing unstructured data within documents) by more than 5x.
In February, Marvell showed how a trio of Structera A CXL compute accelerators can process more queries per second than a cutting-edge server CPU and at a lower latency while leaving the host CPU open for different computing tasks.
Today, this blog post will show how Structera CXL memory expanders can boost performance of inference tasks.
AI and Memory Expansion
Unlike CXL compute accelerators, CXL memory expanders do not contain additional processing cores for near-memory computing. Instead, they supersize memory capacity and bandwidth. Marvell Structera X, released last year, provides a path for adding up to 4TB of DDR5 DRAM or 6TB of DDR4 DRAM to servers (12TB with integrated LZ4 compression) along with 200GB/second of additional bandwidth. Multiple Structera X modules, moreover, can be added to a single server; CXL modules slot into PCIe ports rather than the more limited DIMM slots used for memory.

By Khurram Malik, Senior Director of Marketing, Custom Cloud Solutions, Marvell
While CXL technology was originally developed for general-purpose cloud servers, it is now emerging as a key enabler for boosting the performance and ROI of AI infrastructure.
The logic is straightforward. Training and inference require rapid access to massive amounts of data. However, the memory channels on today’s XPUs and CPUs struggle to keep pace, creating the so-called “memory wall” that slows processing. CXL breaks this bottleneck by leveraging available PCIe ports to deliver additional memory bandwidth, expand memory capacity and, in some cases, integrate near-memory processors. As an added advantage, CXL provides these benefits at a lower cost and lower power profile than the usual way of adding more processors.
To showcase these benefits, Marvell conducted benchmark tests across multiple use cases to demonstrate how CXL technology can elevate AI performance.
In December, Marvell and its partners showed how Marvell® StructeraTM A CXL compute accelerators can reduce the time required for vector searches used to analyze unstructured data within documents by more than 5x.
Here’s another one: CXL is deployed to lower latency.
Lower Latency? Through CXL?
At first glance, lower latency and CXL might seem contradictory. Memory connected through a CXL device sits farther from the processor than memory connected via local memory channels. With standard CXL devices, this typically results in higher latency between CXL memory and the primary processor.

Marvell Structera A CXL memory accelerator boards with and without heat sinks.