Intel Releases Core Ultra H and U-Series Processors: Meteor Lake Brings AI and Arc to Ultra Thin Notebooks

Intel has released their first mobile processors based on their highly anticipated Meteor Lake platform: the Core Ultra H and Core Ultra U series. Available today, the Core Ultra H series has four SKUs: two 16-core (6P+8E+2LP) Core Ultra 7 chips and two 14-core (4P+8E+2LP) Core Ultra 5 chips. All run at a base TDP of 28 W, with a maximum turbo TDP of up to 115 W. The Core Ultra H series is designed for ultra-portable notebooks, offering more compute and graphics performance within a slimline package.
Also announced is the Intel Core Ultra U-series, which includes four 15/57 W (base/turbo) SKUs: two Core Ultra 7 and two Core Ultra 5 parts, differing in their P-core, E-core, and integrated Arc Xe graphics frequencies. All of Intel’s announced Core Ultra U-series mobile processors feature 10 CPU cores on the compute tile, with two Performance cores and eight Efficiency cores, making them ideal for lower-powered and ultra-thin notebooks.
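As a quick sanity check on the core counts quoted above, here is a minimal Python sketch that sums each configuration. The class and dictionary names are illustrative only. The U-series entry assumes the two LP-E cores that the article says both families include; counting them gives 12 cores, whereas the quoted figure of 10 covers only the P and E cores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreConfig:
    """Core counts for one chip configuration (illustrative type)."""
    p_cores: int      # Performance cores
    e_cores: int      # Efficiency cores
    lp_e_cores: int   # Low Power Island (LP-E) cores

    @property
    def total(self) -> int:
        # Every core on the package, including the LP-E cores.
        return self.p_cores + self.e_cores + self.lp_e_cores

    @property
    def label(self) -> str:
        return f"{self.p_cores}P+{self.e_cores}E+{self.lp_e_cores}LP"


# Figures taken from the article; the U-series LP-E count is an assumption
# based on the statement that both families include two LP-E cores.
CONFIGS = {
    "Core Ultra 7 (H-series)": CoreConfig(6, 8, 2),
    "Core Ultra 5 (H-series)": CoreConfig(4, 8, 2),
    "Core Ultra (U-series)": CoreConfig(2, 8, 2),
}

for name, cfg in CONFIGS.items():
    print(f"{name}: {cfg.label} = {cfg.total} cores")
```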
The launch of Intel’s tile-based Meteor Lake SoC marks the first step in a series of power-efficient and AI-focused chips on Intel 4 for the mobile market, which are ultimately designed to cater to the growing need for on-chip AI inferencing. Both the Intel Core Ultra H and U families of chips include two new Low Power Island (LP-E) cores for low-intensity workloads, along with two Neural Compute Engines within the Intel AI NPU designed to tackle generative AI inferencing.

Intel Core Ultra Processors: Quickly Recapping Meteor Lake

In September, Intel unveiled their chiplet-based Meteor Lake SoC architecture during their annual Innovation event, which splits a conventional monolithic processor into four individual tiles. Created using their Foveros 3D packaging, Intel is using a mix of process nodes to put together their first chiplet-based CPU. The most critical chiplet, the CPU tile, is being built on Intel’s EUV-based Intel 4 node, the latest and greatest fab tech out of the company, which promises to provide robust gains in performance and energy efficiency compared to the long-standing Intel 7 process. Joining the CPU tile are tiles for the integrated GPU, SoC, and I/O functions, which are built on a mix of trailing-edge and even external process nodes.
Below is our deep dive into Meteor Lake as an SoC architecture, as well as all the key components such as the compute, I/O, graphics, and SoC tile:
Quickly recapping the Meteor Lake SoC architecture, it is essentially four interconnected tiles, including a compute, graphics, SoC, and an I/O tile. Within each of the tiles are a host of new advancements, including the Redwood Cove Performance (P) cores and Crestmont Efficiency (E) cores housed within the compute tile. On top of this, Intel also has a special variant of the E-core, called the Low Power Island or LP-E core, which is integrated into the SoC tile and is designed to tackle low-intensity workloads. Notably, because the SoC tile is essentially always active, the LP-E core is very cheap to use from an energy standpoint compared to powering up the CPU tile.
Meteor Lake is an upgrade and a significant architectural shift for Intel, moving away from traditional monolithic designs to a chiplet-based approach. This shift, leveraging Intel’s Foveros 3D packaging technology, introduces 3D chip stacking to overcome the limitations of 2D chip layouts. Like other shifts we’ve seen towards using chiplets, the architecture’s focus on disaggregation, power efficiency, and flexible silicon gives Intel new options for assembling CPUs out of individual blocks.
The architecture’s modular design facilitates scalable power management: each tile can operate independently, which helps maximize performance and energy efficiency. This disaggregation also enables Intel to use different silicon processes for each tile, offering flexibility and cost savings in manufacturing. Meteor Lake’s use of Foveros packaging and low-power, low-distance die-to-die interconnects marks a departure from the Multi-Chip Packaging (MCP) used in the previous Raptor Lake mobile chips, allowing for more optimized power usage and chip customization.
With four differently built yet highly functional tiles, Intel’s Meteor Lake looks to increase the customization of their notebook SKUs in the future. A tile-based solution enables Intel to combine a variety of different engines, blocks, and tiles into one chip. Using their Foveros packaging technology also allows Intel to build chips differently, and more importantly, it means they aren’t limited to one specific manufacturing process – a hedge against problems with any one fab/node. Even in the Core Ultra U and H series chips Intel is announcing today, the manufacturing choice differs from tile to tile: the compute tile is built on the Intel 4 node, the graphics tile with Arc Xe graphics is built on TSMC’s N5 node, and the SoC and I/O tiles are built using TSMC’s N6 process. This flexibility means Intel can implement new technologies from different process nodes, tapping the benefits of any given node’s specialty (e.g. frequency or density) and not having to produce (and yield) an entire chip on a leading-edge process.
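The tile-to-process assignment described above can be written down as a small lookup table. The sketch below is purely illustrative: the dictionary layout and the helper name `tiles_by_foundry` are assumptions for this example, while the node names are those given in the article.

```python
# Tile -> (foundry, process node), as listed in the article.
TILE_PROCESS = {
    "compute": ("Intel", "Intel 4"),
    "graphics": ("TSMC", "N5"),
    "soc": ("TSMC", "N6"),
    "io": ("TSMC", "N6"),
}


def tiles_by_foundry(mapping):
    """Group tile names by the foundry that manufactures them."""
    grouped = {}
    for tile, (foundry, _node) in mapping.items():
        grouped.setdefault(foundry, []).append(tile)
    return grouped


for foundry, tiles in tiles_by_foundry(TILE_PROCESS).items():
    print(f"{foundry}: {', '.join(tiles)}")
```

Grouping the tiles this way makes the point of the paragraph concrete: only one of the four tiles depends on Intel's own leading-edge node.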
Taking a quick look at the underlying architecture, on the compute tile of the Core Ultra series in the first iteration of Meteor Lake, Intel is using two new CPU architectures within the heterogeneous design. Intel’s Meteor Lake compute tile is built using the Intel 4 node, and the process offers 2x the area scaling for the high-performance logic libraries compared to the previous Intel 7 node. The latest Performance cores are called Redwood Cove, which Intel claims brings new benefits over the previous Golden Cove P-core, including better performance per watt, improved feedback through Intel Thread Director within Windows 11, more bandwidth, and improved performance monitoring capabilities. All of these improvements combined are designed to provide enhanced feedback to Thread Director to help optimize core performance and direct workloads to the right cores.
Notably, however, Intel hasn’t said anything about Redwood Cove’s IPC. Reading between the lines, we are left with the distinct impression that Redwood Cove’s IPC is similar (if not identical) to Golden Cove’s. And if that’s the case, it means Intel won’t be moving the needle on single-threaded performance in this generation – at least, not in TDP-unconstrained scenarios. In fact, the peak P-core clockspeeds for Core Ultra (Meteor Lake) chips are lower than those of 13th Gen Core Mobile (Raptor Lake) chips – 5.1GHz vs. 5.4GHz – so it’s entirely plausible that some Core Ultra chips could lose in single-threaded CPU benchmarks to 13th Gen Core chips. All of which is to say that while Intel should still pick up some real-world performance here due to the energy efficiency improvements, Redwood Cove is more of a side-grade in terms of architecture.
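To put rough numbers on that argument, the sketch below uses the common first-order model in which single-threaded performance scales with IPC times clock speed. The model and the function name are simplifying assumptions for illustration; real benchmarks also depend on memory, cache, and power limits. Only the 5.1GHz and 5.4GHz peak clocks come from the article.

```python
def relative_single_thread_perf(ipc_ratio, new_clock_ghz, old_clock_ghz):
    """First-order estimate: performance ~ IPC x clock (an assumed model)."""
    return ipc_ratio * (new_clock_ghz / old_clock_ghz)


METEOR_LAKE_PEAK_GHZ = 5.1   # peak P-core clock quoted in the article
RAPTOR_LAKE_PEAK_GHZ = 5.4   # peak P-core clock quoted in the article

# If IPC is unchanged (ratio of 1.0), peak performance follows the clock ratio.
ratio = relative_single_thread_perf(1.0, METEOR_LAKE_PEAK_GHZ, RAPTOR_LAKE_PEAK_GHZ)

# IPC uplift Redwood Cove would need to match the older chip at peak clocks.
breakeven_ipc = RAPTOR_LAKE_PEAK_GHZ / METEOR_LAKE_PEAK_GHZ

print(f"Relative peak single-thread performance at equal IPC: {ratio:.3f}")
print(f"IPC ratio needed to break even: {breakeven_ipc:.3f}")
```

Under this simple model, identical IPC leaves the newer chip roughly 5 to 6 percent behind at peak clocks, and it would need an IPC gain of about 6 percent to draw level.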
Intel’s efficiency cores, on the other hand, should deliver a major improvement. Intel is using their Crestmont architecture for the efficiency cores, which Intel claims brings IPC gains over their 13th Gen Raptor Lake E-cores, along with AI acceleration optimizations in the form of VNNI and other ISA improvements, and enhanced feedback to Intel Thread Director. For low-intensity workloads, Intel includes two new Low Power Island (LP-E) cores on each of the announced SKUs. Housed within the SoC tile, these cores allow light workloads to be offloaded from the compute tile to enhance overall power efficiency. Intel Thread Director with Windows 11 is a key component in ensuring the right workloads go onto the right cores for the best performance and power efficiency levels.
Another key component (or tile) within the heterogeneous Meteor Lake SoC design is an upgrade to Intel’s Arc graphics architecture. Built on TSMC’s N5 node, the graphics core of choice is the Arc Xe-LPG core, which is a derivative of Intel’s discrete Xe-HPG GPU architecture. Each Xe core comprises 16 Vector Engines, each 256 bits wide, along with 192 KB of shared L1 cache. Each Vector Engine can perform 16 FP32, 32 FP16, or 64 INT8 operations per clock. Also featured is a dedicated FP64 unit, which is new compared to Raptor Lake (13th Gen), and pairs of Vector Engines operate in lockstep for improved efficiency.
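The per-clock figures quoted above can be multiplied out for a group of 16 Vector Engines. This is a back-of-the-envelope sketch. Reading the 64 INT8 figure as a per-engine rate is an assumption, and the clock speed used in the example is a hypothetical placeholder, since the article does not give a GPU frequency.

```python
# Operations per clock per Vector Engine, as quoted in the article.
# Treating the INT8 figure as per-engine is an assumption.
OPS_PER_CLOCK_PER_ENGINE = {"FP32": 16, "FP16": 32, "INT8": 64}
VECTOR_ENGINES = 16  # size of the Vector Engine group quoted in the article


def ops_per_clock(engines, per_engine):
    """Aggregate per-clock throughput across a group of Vector Engines."""
    return {dtype: engines * ops for dtype, ops in per_engine.items()}


def ops_per_second(per_clock, clock_hz):
    """Scale per-clock throughput by a clock frequency in Hz."""
    return {dtype: ops * clock_hz for dtype, ops in per_clock.items()}


group = ops_per_clock(VECTOR_ENGINES, OPS_PER_CLOCK_PER_ENGINE)
print(group)

# Hypothetical 1 GHz clock, chosen only to keep the numbers round.
HYPOTHETICAL_CLOCK_HZ = 1_000_000_000
print(ops_per_second(group, HYPOTHETICAL_CLOCK_HZ))
```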
