AMD came in for some harsh criticism when it announced that its new Ryzen 5000 Mobile U-series processors would not all be using its latest core design. At the product announcement, we were told that some of the U-series processors would be based on the previous Zen 2 generation, and this was mainly for partners to take advantage of the new naming scheme but also reuse designs with the same ballpark performance. A number of tech enthusiasts (including myself, I have to say) scoffed at this as it made the whole system complex. It’s still complex, but we’ve come to understand that these latest Zen 2 based mobile processors also include a whole raft of updates that make them a better version of what they are.
To simplify things I’m going to call these products by their AMD codenames. The older Zen 2 processors are called Renoir, and the newer Zen 2 processors are called Lucienne. Here is a list of the new Ryzen 5000 U-Series, with Lucienne listed in yellow.
Renoir, for all intents and purposes, was a very successful product for AMD. Placed in the Ryzen 4000 Mobile series, it became the bedrock of AMD’s mobile portfolio and has been installed in around 100 design wins since it came to market. Lucienne on the other hand is a minor player in the latest Ryzen 5000 Mobile series. It doesn’t have the updates that the new Zen 3 cores have, but we have since learned that on the power side of things, rather than being a copy of Renoir, it is almost certainly Renoir Plus.
What Lucienne brings to the table over Renoir comes in discrete categories.
Memory Controller
The memory controller in Lucienne is now able to decouple its voltage from the cores and enter a lower power state when it is not in use or when bandwidth demand is low. This ultimately saves power, and AMD has enabled it to bypass particular voltage indicators to help it stay in the low voltage state. Aside from the cores and the graphics, the other two consumers of power inside a mobile processor are the internal communications and the external communications, and the memory controller falls under the latter. AMD has also put into place a system by which the memory controller can wake to a full bandwidth state faster than before, enabling better responsiveness from those deep sleep states.
On top of this, the memory controller can now support double the memory capacity of Renoir: up to 64 GB of DDR4-3200, or up to 32 GB of LPDDR4X-4267. Using DDR4 means the system can have a higher maximum capacity, as well as being user adjustable; LPDDR4X trades those in for higher bandwidth overall (68.4 GB/s vs 51.2 GB/s).
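As a rough check on those bandwidth figures, peak theoretical bandwidth is the transfer rate multiplied by the bus width. The sketch below assumes a 128-bit memory interface for both memory types, which is not stated above; the function name is purely illustrative. Under that assumption DDR4-3200 comes out at the quoted 51.2 GB/s, and LPDDR4X-4267 lands marginally below the quoted figure, presumably a matter of rounding.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int = 128) -> float:
    """Theoretical peak bandwidth in GB/s (decimal, 1 GB = 1e9 bytes).

    bus_width_bits defaults to 128, an assumption about the memory interface.
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


ddr4 = peak_bandwidth_gbs(3200)     # DDR4-3200
lpddr4x = peak_bandwidth_gbs(4267)  # LPDDR4X-4267

print(f"DDR4-3200:    {ddr4:.1f} GB/s")
print(f"LPDDR4X-4267: {lpddr4x:.1f} GB/s")
```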
Per-Core Voltage Control
As with the memory controller, having voltage control of each individual core in a mobile processor is one way to both maximize performance when needed and minimize power loss when idle. In Renoir, all of the cores could adjust their frequency, but they all had to run at the same voltage. Lucienne changes that such that each core can adjust its voltage independently, enabling finer-grained power management and a more power-efficient system. There are also additional hooks that operating systems can use if they know in advance that high performance cores are needed.
Preferred Core
When we speak about turbo, historically it has been assumed that any core can reach the highest single core turbo frequency, and that the workload is sometimes shifted between cores to help with thermal management. When a system uses a preferred core, however, the system can be optimized for that specific core and more performance extracted from it. AMD introduced its Preferred Core technology on the desktop two generations ago, and now it comes to the mobile processors. One core out of the eight on Lucienne silicon will be designated the best core, and through an OS driver (default in Windows) all workloads will be placed on that core preferentially.
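To make the idea of preferential placement concrete, here is a minimal sketch of a scheduler choosing the best available core. It is an illustration only: the per-core ranking table, the `place_workload` helper, and the rule of falling back to the next-best idle core are all assumptions, not a description of AMD's or Windows' actual logic.

```python
def place_workload(core_rank: dict[int, int], busy: set[int]) -> int:
    """Return the idle core with the highest rank (higher = better core)."""
    idle = [core for core in core_rank if core not in busy]
    if not idle:
        raise RuntimeError("no idle core available")
    return max(idle, key=lambda core: core_rank[core])


# Hypothetical ranking for an eight-core part: core 4 is the preferred core.
CORE_RANK = {0: 3, 1: 7, 2: 5, 3: 1, 4: 8, 5: 2, 6: 6, 7: 4}

first = place_workload(CORE_RANK, busy=set())     # lands on the preferred core
second = place_workload(CORE_RANK, busy={first})  # next-best core once it is busy
print(first, second)
```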
Frequency Ramp
One of the features that ties all of this together is how quickly a core can move from idle to peak performance and back again. If a system takes too long to ramp up to speed, or to ramp back down, then responsiveness and power are lost. A typical modern system is expected to ramp up from idle to peak frequency within two frames at 60 Hz, or roughly 32 milliseconds; however, the latest systems from AMD and Intel have done it much faster, often within 16 ms. AMD’s enhanced clock gating technology now enables Lucienne to reduce that to 1-2 milliseconds. This means that a system could easily ramp up and down between each keystroke on a keyboard, supplying immediate responsiveness to a user while keeping the total power use down. In the 16-32 millisecond regime, typing on a keyboard may have meant a core being active almost continuously, whereas faster transitions let the core spend far more of that time idle, which affords a lot of power savings.
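The keystroke argument can be put into rough numbers. The sketch below is a toy model, not a measurement: it assumes one keystroke every 100 ms, 1 ms of actual work per keystroke, and a ramp down that takes as long as the ramp up. All three figures, and the `active_fraction` helper, are assumptions chosen for illustration.

```python
def active_fraction(ramp_ms: float, work_ms: float, interval_ms: float) -> float:
    """Fraction of each keystroke interval the core spends out of idle.

    Each keystroke costs one ramp up, the work itself, and one ramp down.
    The result is capped at 1.0, meaning the core never returns to idle.
    """
    busy_ms = 2 * ramp_ms + work_ms
    return min(busy_ms / interval_ms, 1.0)


KEYSTROKE_INTERVAL_MS = 100.0  # assumed: about ten keystrokes per second
WORK_PER_KEY_MS = 1.0          # assumed: time to actually handle a keypress

for ramp_ms in (32.0, 16.0, 2.0):
    frac = active_fraction(ramp_ms, WORK_PER_KEY_MS, KEYSTROKE_INTERVAL_MS)
    print(f"{ramp_ms:>4.0f} ms ramp: core out of idle {frac:.0%} of the time")
```

Under these assumptions the core is out of idle for well over half of the time with a 32 ms ramp, and for only a few percent of the time with a 2 ms ramp.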
Continuous Performance Levels
The legacy way for an operating system to command performance is through performance states, or P-states. In this model the OS requests a specific level of power and performance from the processor based on its detected workload, and the processor responds. This was originally implemented at a time when turbo was first coming to modern processors, and workload analysis was better done through the operating system. Now we can do this level of monitoring on the processor directly, and through an OS driver (already part of Windows), and with system support, that level of frequency control can be passed back down to the processor. The processor also gets an effectively continuous range of performance levels, rather than discrete P-states.
While Renoir had P-states, Lucienne gets the benefit of CPU-level performance requests.
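The difference between the two models can be sketched as follows. The P-state table here is hypothetical (these are not AMD's real operating points), and both helper functions are illustrative simplifications: the legacy path rounds a request up to the next discrete state, while the continuous path only clamps it to the supported range.

```python
import bisect

# Hypothetical P-state table in MHz, sorted ascending (not AMD's real values).
P_STATES_MHZ = [1400, 1800, 2400, 3200, 4000]


def pick_p_state(requested_mhz: float) -> int:
    """Legacy model: round the request up to the nearest discrete P-state."""
    idx = bisect.bisect_left(P_STATES_MHZ, requested_mhz)
    idx = min(idx, len(P_STATES_MHZ) - 1)  # requests above the top state get the top state
    return P_STATES_MHZ[idx]


def pick_continuous(requested_mhz: float) -> float:
    """Continuous model: clamp the request to the supported range."""
    return max(P_STATES_MHZ[0], min(requested_mhz, P_STATES_MHZ[-1]))


request = 2500
print(pick_p_state(request))     # overshoots to the next discrete state
print(pick_continuous(request))  # runs at the requested level
```

In this toy example a workload that needs 2500 MHz is pushed to the 3200 MHz state under the discrete model, spending power on frequency it did not ask for, while the continuous model can sit at the requested level.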
Faster Integrated Graphics
With the additional power control elsewhere on the chip, power delivery to the integrated graphics was also adjusted to allow for better regulation and ultimately a lower minimum voltage. Through firmware AMD has enabled a frequency-sensitive prediction model that allows the GPU to adjust its voltage and frequency based on its dynamic energy management. Coupled with the better regulation and the power budget balancing done between CPU, interconnect, DRAM, and the GPU, more power budget is available for the GPU. For Lucienne, this means +150 MHz on the peak IGP speeds compared to Renoir.
Slide shows Cezanne numbers, but applies to Lucienne as well
But I thought Lucienne Silicon was the same as Renoir Silicon?
This is the big question. We asked AMD if Lucienne was the same stepping as Renoir, and the answer was not exactly committal in one direction or the other. The simple answer is yes; however, AMD wants to make clear that substantial changes were made to firmware and manufacturing, which mean that despite the transistor layout being identical, there are features of Lucienne that would never have worked in Renoir without those changes.
So while yes, it is the same silicon layout and floorplan, some of these features weren’t possible in Renoir. AMD built in these features perhaps knowing that they couldn’t be enabled in Renoir, but sufficient changes and improvements were made at the manufacturing and firmware stages that these features could be enabled in Lucienne. More often than not these ideas have very strict time windows to implement, and even if a feature is designed into the hardware, there is a strict cut-off point: if it doesn’t work as intended by then, it doesn’t get enabled. Obviously the best result is to have everything work on time, but building CPUs is harder than we realize.
Sometimes I wonder how we ever get these rocks powered by lightning to work in the first place.
from AnandTech https://ift.tt/2Mqba2W