Intel, SambaNova and Foxconn are building production-ready racks that split AI inference across three chip architectures.

Intel demonstrated a decoupled inference system at Computex 2026 in Taipei on Monday that separates the pre-filling and decoding phases of AI inference across different processors. The system, powered by Intel's Vector Core Compute data center platform and orchestrated by its Xeon 6 processors, uses SambaNova's SN40 RDU for decoding and Nvidia's Blackwell GPU for pre-filling. Foxconn, the world's largest electronics manufacturer, provided system integration support and exhibited production-ready racks at the show.
"This architecture lets customers optimize each phase of inference independently rather than forcing everything through a single GPU pipeline," an Intel representative said at the event. The approach targets a structural inefficiency in current AI deployments: pre-filling — the computationally intensive first pass that processes a user's prompt — and decoding — the token-by-token generation of a response — have different hardware requirements that a single chip type cannot satisfy efficiently.
The decoupled model addresses a growing pain point for enterprises running large language models in production. Pre-filling demands high memory bandwidth and matrix compute, where Nvidia's H100 and Blackwell GPUs excel. Decoding, by contrast, is more latency-sensitive and benefits from the specialized dataflow architecture of SambaNova's RDU (reconfigurable dataflow unit). With the workload split this way, Intel's Xeon 6 acts as the orchestrator, routing each phase to the processor best suited to it.
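To make the routing idea concrete, here is a minimal Python sketch of an orchestrator that sends the pre-fill phase to one pool of workers and the decode phase to another. It is an illustration only, not Intel's or SambaNova's software: the names `Phase`, `Worker` and `Orchestrator`, the round-robin policy and the `"kv-cache-handle"` placeholder are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Phase(Enum):
    PREFILL = "prefill"  # one pass over the whole prompt
    DECODE = "decode"    # token-by-token generation


@dataclass
class Worker:
    """Hypothetical stand-in for an accelerator (e.g. a GPU or an RDU)."""
    name: str
    phase: Phase
    handled: list = field(default_factory=list)

    def run(self, request_id: str, payload: str) -> str:
        # A real worker would execute model kernels; this one only records the call.
        self.handled.append(request_id)
        return f"{self.name} handled {self.phase.value} for {request_id}"


class Orchestrator:
    """Hypothetical host-side router: one worker pool per inference phase."""

    def __init__(self, workers):
        self.pools = {}
        for worker in workers:
            self.pools.setdefault(worker.phase, []).append(worker)
        # Round-robin cursor per phase (an assumed policy, chosen for simplicity).
        self.cursor = {phase: 0 for phase in self.pools}

    def dispatch(self, phase: Phase, request_id: str, payload: str) -> str:
        pool = self.pools[phase]
        index = self.cursor[phase]
        self.cursor[phase] = (index + 1) % len(pool)
        return pool[index].run(request_id, payload)

    def serve(self, request_id: str, prompt: str) -> list:
        # Phase 1: pre-fill processes the prompt on the pre-fill pool.
        prefill_result = self.dispatch(Phase.PREFILL, request_id, prompt)
        # Phase 2: decode runs on the decode pool. The string below is a
        # placeholder for whatever state a real system hands between phases.
        decode_result = self.dispatch(Phase.DECODE, request_id, "kv-cache-handle")
        return [prefill_result, decode_result]


if __name__ == "__main__":
    orchestrator = Orchestrator([
        Worker("gpu-0", Phase.PREFILL),
        Worker("rdu-0", Phase.DECODE),
        Worker("rdu-1", Phase.DECODE),
    ])
    for line in orchestrator.serve("req-1", "Summarize this report."):
        print(line)
```

In a real deployment the hard part is likely to be what this sketch leaves out: moving the intermediate state produced by pre-fill to the decode hardware quickly enough that the split pays off.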
Intel also announced its Xeon 6+ processor line and agent cloud services for decoupled inference, extending its push into the AI data center market, where Nvidia has commanded an estimated 80% of training and inference silicon revenue. SambaNova, a startup valued at over $5 billion after its 2024 funding round, supplies the decoding silicon, while Foxconn gives Intel a manufacturing and integration partner capable of delivering complete racks rather than just chips.
The timing is strategic. Nvidia used its own Computex keynote Monday to unveil the RTX Spark Superchip, its first consumer PC processor, and confirmed that its Vera Rubin data center platform has entered full production. Nvidia's data center revenue reached $35.6 billion in its most recent fiscal quarter, dwarfing Intel's data center and AI segment, which posted $4.1 billion. But Intel's bet on heterogeneous inference — using multiple chip types in a single workload — offers a differentiated value proposition for enterprises that want to avoid total vendor lock-in to Nvidia's CUDA ecosystem.
For investors, the question is whether Intel can convert this architecture into revenue. Intel's data center and AI revenue fell 8% year-over-year in its most recent quarter, and the company has struggled to regain share lost to Nvidia and AMD in AI compute. The Foxconn partnership provides a path to volume production: the contract manufacturer's ability to integrate, test and ship complete racks at scale could accelerate enterprise adoption. Intel shares have gained roughly 200% year-to-date on optimism around its turnaround, but the company still trades at a discount to Nvidia's 35x forward earnings multiple.
This article is for informational purposes only and does not constitute investment advice.