Intel had a tech day this week to discuss the future of its own 14nm and 10nm chips, and it laid out some impressive claims in the process. It also laid out what it believes is a better metric for calculating process nodes in the future, though we suspect TSMC, Samsung, and GloFo may all disagree.
Intel’s presentation focused mainly on two related topics: The strength and characteristics of its 10nm node, and its proposal for a new (or rather, a return to an old) method of calculating process node scaling.
Intel drives “hyperscaling” at the 10nm node
Intel is claiming that its 10nm node will deliver a 2.7x improvement in transistor density compared with its 14nm products. That’s a significant jump, and it’s not just the result of improvements to the usual semiconductor manufacturing metrics. Intel has improved its 10nm scaling compared with 14nm through the use of two specific new technologies: a single dummy gate and contact-over-active-gate. Typically, logic cells use a pair of what are called “dummy gates” to isolate each cell from its neighbors. Intel has found a way to use just one dummy gate instead of a pair, and has recovered significant space as a result.
Each gate has multiple contacts that connect to the metal layers within a CPU. Typically, these contacts are offset from the gate. At 10nm, Intel is moving the contact directly over the active gate, which frees up additional transistor space. All of these advantages combined are why Intel is claiming such significant improvements compared with 14nm. The slides below include visualizations of these improvements and how they collectively improve Intel’s 10nm process node.
The general thrust of Intel’s argument is that these new technologies will give it a significant lead over its rivals in the foundry space.
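To put the 2.7x figure in perspective, it helps to convert a density gain into the area and linear shrink it implies. The short Python sketch below does that arithmetic. The 2.0x comparison value is our own assumption (the familiar rule of thumb that a full node roughly doubles density), not a figure taken from Intel’s slides, and the function names are ours.

```python
def area_scale_from_density(density_gain: float) -> float:
    """Relative area needed for the same logic after a density gain."""
    return 1.0 / density_gain


def linear_scale_from_density(density_gain: float) -> float:
    """Equivalent uniform linear shrink (area scales with length squared)."""
    return density_gain ** -0.5


CLAIMED_GAIN = 2.7      # Intel's claimed 14nm -> 10nm density improvement
RULE_OF_THUMB_GAIN = 2.0  # assumption: a conventional full-node doubling

for label, gain in [("claimed 10nm", CLAIMED_GAIN),
                    ("rule of thumb", RULE_OF_THUMB_GAIN)]:
    area = area_scale_from_density(gain)
    linear = linear_scale_from_density(gain)
    print(f"{label}: {gain:.1f}x density, "
          f"{area:.2f}x area, {linear:.2f}x linear dimensions")
```

Under those assumptions, the same logic block would occupy a little over a third of its 14nm area, compared with half for a conventional doubling.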
Intel vs. everyone at the 10nm node
One major point of Intel’s presentation is actually something we’ve discussed before: When TSMC, Samsung, GlobalFoundries and Intel talk about 14nm, the process nodes they discuss are not identical. Each foundry has implemented different feature sizes and each of their nodes (16nm/12nm for TSMC, 14nm for Samsung and GF) has its own distinct characteristics. To some extent, the modern definition of a node has become “The bucket of technologies we periodically dump in to improve some aspects of transistor density, power consumption, and performance enough to justify moving to a new number.” The problem, at least for Intel, is that when a company like Samsung announces 10nm production, that’s actually roughly equivalent to Intel’s 14nm, which debuted in 2014.
To get around this issue, Intel has described a method for calculating process node size that would weight certain factors differently than they are weighted today. Intel’s Process Architecture and Integration Director, Mark Bohr, described the formula in an editorial published by Intel:
One simple metric is gate pitch (gate width plus spacing between transistor gates) multiplied by minimum metal pitch (interconnect line width plus spacing between lines), but this doesn’t incorporate logic cell design, which affects the true transistor density. Another metric, gate pitch multiplied by logic cell height, is a step in the right direction with regard to this deficiency. But neither of these takes into account some second order design rules…
At the other extreme, simply taking the total transistor count of a chip and dividing by its area is not meaningful because of the large number of design decisions that can affect it – factors such as cache sizes and performance targets can cause great variations in this value.
It’s time to resurrect a metric that was used in the past but fell out of favor several nodes ago. It is based on the transistor density of standard logic cells and includes weighting factors that account for typical designs. While there is a large variety of standard cells in any library, we can take one ubiquitous, very simple one – a 2-input NAND cell (4 transistors) – and one that is more complex but also very common: a scan flip flop (SFF). This leads to a previously accepted formula for transistor density:
0.6 × (NAND2 transistor count / NAND2 cell area) + 0.4 × (scan flip flop transistor count / scan flip flop cell area) = transistors per mm2
(The weightings 0.6 and 0.4 reflect the ratio of very small and very large cells in typical designs.)
Every chip maker, when referring to a process node, should disclose its logic transistor density in units of MTr/mm2 (millions of transistors per square millimeter) as measured by this simple formula. Reverse engineering firms can readily verify the data.
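As a rough illustration of the metric Bohr describes, the Python sketch below computes a weighted logic density from a NAND2 cell and a scan flip flop, using the 0.6 and 0.4 weightings and the 4-transistor NAND2 from the editorial. The function name is ours, and the cell areas and the scan flip flop transistor count in the example are invented placeholders, not measurements from any real process. Areas are given in square micrometers, so one transistor per square micrometer works out to 1 MTr/mm2.

```python
NAND2_WEIGHT = 0.6       # weighting from the editorial (small cells)
SFF_WEIGHT = 0.4         # weighting from the editorial (large cells)
NAND2_TRANSISTORS = 4    # a 2-input NAND cell has 4 transistors


def logic_density_mtr_per_mm2(nand2_area_um2: float,
                              sff_transistors: int,
                              sff_area_um2: float) -> float:
    """Weighted logic transistor density in MTr/mm2.

    Areas are in square micrometers; Tr/um^2 is numerically
    equal to MTr/mm^2 because 1 mm^2 = 1e6 um^2.
    """
    if nand2_area_um2 <= 0 or sff_area_um2 <= 0:
        raise ValueError("cell areas must be positive")
    nand2_density = NAND2_TRANSISTORS / nand2_area_um2
    sff_density = sff_transistors / sff_area_um2
    return NAND2_WEIGHT * nand2_density + SFF_WEIGHT * sff_density


# Hypothetical cell data, for illustration only (not real foundry numbers).
example = logic_density_mtr_per_mm2(nand2_area_um2=0.05,
                                    sff_transistors=36,
                                    sff_area_um2=0.4)
print(f"Weighted logic density: {example:.1f} MTr/mm2")
```

Because the inputs are just two cell layouts and their transistor counts, this is the kind of figure a reverse engineering firm could check from die photos, which is the point Bohr makes about verification.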
Measuring process node metrics in this fashion would give Intel an obvious advantage over all of its rivals. As such, it’s hard to imagine they’ll ever endorse it. Whether it’s TSMC with a new “12nm” node or potential future gaps between the various pure-play foundries at 7nm, nobody but Intel will substantially benefit from measuring process nodes differently, even if Intel’s method is more accurate.
The slides below step through some of Intel’s transistor density comparisons against its competitors’ nodes, as well as how it compares against them on specific metrics and how large it expects the gaps to be at 10nm.
We can’t speak to the accuracy of Intel’s projections for its competitors at 10nm, but the company’s general projection matches what we’ve seen at the 14nm node. With Intel taking on more foundry customers (or trying to), it doubtless sees this as a potential barrier to those efforts. What’s the point of calling a node “10nm” (from Intel’s perspective) if that artificially inflates the value of what your competitors are selling and diminishes your own value?
The other interesting question is how much these advantages will practically matter going forward. When Intel intended to break into the tablet and smartphone markets to compete directly against ARM, it had an obvious rationale for comparing its own performance against Samsung, GlobalFoundries, and TSMC. Today, Intel and the pure-play foundries once again move in different circles — but that could change in the future if Intel can land some significant foundry deals.
The other reason foundry die shrinks aren’t as interesting as they used to be is that they don’t deliver much in the way of improved performance any more, at least not outside specific market segments. Mobile is one segment where they still do. Six years ago, the Core i5-2537M was a Sandy Bridge-era core clocked at 1.4GHz base / 2.3GHz Turbo with a TDP of 17W. Today, Intel’s Core i5-7300U has a 2.6GHz base clock and a 3.5GHz Turbo, and its TDP is 15W. Intel, in other words, improved the base clock by 1.86x and the boost clock by 1.52x, all while trimming the TDP.
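For anyone who wants to check the math on those two mobile chips, the ratios fall straight out of the clock speeds and TDPs quoted above. The snippet below is just that arithmetic in Python; the dictionary layout and function name are ours.

```python
# Clock speeds in GHz and TDP in watts, as quoted in the article.
OLD_CHIP = {"name": "Core i5-2537M", "base": 1.4, "turbo": 2.3, "tdp": 17}
NEW_CHIP = {"name": "Core i5-7300U", "base": 2.6, "turbo": 3.5, "tdp": 15}


def improvement(new: float, old: float) -> float:
    """Ratio of the new value to the old one."""
    return new / old


base_gain = improvement(NEW_CHIP["base"], OLD_CHIP["base"])
turbo_gain = improvement(NEW_CHIP["turbo"], OLD_CHIP["turbo"])
tdp_ratio = improvement(NEW_CHIP["tdp"], OLD_CHIP["tdp"])

print(f"Base clock:  {base_gain:.2f}x")
print(f"Turbo clock: {turbo_gain:.2f}x")
print(f"TDP:         {tdp_ratio:.2f}x of the older chip")
```

Clock speed is only a proxy here, since it ignores any per-clock improvements between the two architectures, but it is the comparison the figures above support.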
But in the desktop space, performance gains have been rarer, overclocking headroom has shrunk, and frequency gains have been doled out very sparingly. Intel has absolutely made progress over this time, but not at anything like a quick pace. Again, that’s not the company’s fault — it’s related, as we’ve said before, more to the difficulties with scaling silicon than anything — but with AMD surging back into the CPU fight, Intel may find itself under increased pressure to demonstrate that its 10nm hardware can do more than look impressively futuristic in a variety of graphs. Intel’s full presentations are available for download, if you’d care to flip through them yourself.